If you were wondering what triggered this journey, it was all about my breakup with excel.
I was asked to do some analysis on a data set with 6 million rows, basically determining a US state for each row, using crap data, requiring an irriratingly long Case When statement in SQL and/or IF statement in excel. First I needed to join to tables that didn’t exist (like a table of US Zip codes) which required me to learn setting up local databases.
And when sql pro couldn’t export a query that size, I had to learn how to do everything from the command line. Which brings us to the story of how I earned my right to ask questions (or more appropriately, convinced people I’ve done the grunt work before I come to you for help).
It was a simple journey into the world of sed. Sed being one of the most useful first things people learn. Sed being a way to find and replace without opening a file. quite useful when your file is 6 million rows.
But of course what I wanted to do wasn’t simple and I found only a single post that explained this little trick.
What I was trying to do: Change a file from tab delimited (how mysql outputs a file) into comma delimited (how mysql needs it to import it). (And while i’m at it, why does mysql default it to tab delimited anyway when it’s just going to want it in CSV later? asshole.)
The problem: Figuring out why all the code that said ‘s/TAB/,/g’ didn’t work.
The solution: that code is for windows. if you’re on a mac (which i don’t know why you wouldn’t be), you have to replicate the Tab key with the keyboard shortcuts CTL+v Then CTL+i.
The bright side is, i learned a lot about the smart way to test errors, the head command, and became a lot better and quickly creating, editing and moving files around.
The lesson- Prefix all google searches with “OSX terminal” from now on.