Hi fellow Excel lovers out there,
I've had some great experiences with Excel recently that I wanted to share with others. I'm sure there are other ways to do both of these things but I found them to be incredibly efficient for me!
One of the fun things I've been getting to do lately is do some data work, which hopefully will turn into a future endeavor for me to do full time. I like the statistical part of it and trends and numbers have been a bit of a hobby for me over the past couple years.
Power Query is awesome!!
One of the best discoveries I've made recently was that of "Power Query" and the incredible data analytical and organizational things it has encompassed in it.
To launch Power Query for the excel workbook you are using, for Excel 2013 on a Windows 10 64 bit computer using 32 bit software, (I think, it's enterprise Office 365, kind of hard to tell outside of screenshots, I tried looking it up) that I'm using, open the sheet and navigate to:
Data > Get Data > (I have been using "From File > Workbook" but I know that it can do from folder to get data dumps added to a folder and automatically update the query. I have not been successful in getting this to work.. would love to figure out what I'm doing wrong at some point, but not critical).
I am learning how to use Tableau, another data analytical piece of software so I am using sample data freely available to download from their site, for this post. I really enjoy my job so I have to use fake data rather than what I'm really using, they take confidentiality really seriously! (for good reason). I was able to still get some of the basics of what I was doing with my real data, to apply to this data set as well.
From there select the file you wish to launch in Power Query.
Once you find the file, you choose which sheet you'd like to edit. I had two sheets that I was working with, with my real data. One was the raw data (my wife really made sure to let me know how important it was to keep the raw data in case something screws up!) and the other one was the semi-formatted data. I pulled out large chunks of the data because there was a fair amount in there I really didn't care about so I wanted to filter it up front so I didn't have to filter it later.
This is a small snip from the real data I was working with. I was focusing my efforts on determining turn around times (TAT). One of the biggest things is to have good TAT numbers, so I spent a bunch of time figuring out how to get them the data they wanted for a few months. Given the date and time necessary, it led my efforts to find out better ways to calculate it easily with Excel. Once I figured it out, boy was I happy once I did!
From the fake set of data I am working with, I was able to pull some things to mimic what I wanted to do with my real data.
One thing to note, it wasn't relevant with my files this time, but sometimes when you extract an excel file from the application you are getting it from, it contains a few preliminary lines of junk you don't want at the top of the sheet. (extracted by whom, on date, for example)
You want to delete the rows of unnecessary information and the "use first row as headers" and it will then pull up those first rows you want. It happens with one file I extract so I have to use it occasionally, though not recently. Good tip to know!
With my real data, I needed to duplicate columns a couple times to get some calculations done while keeping the existing columns accurate if needed. This also allows for transforming the columns later on. To duplicate a column, right click on it and select "Duplicate Column"
Once you duplicate the columns, it appears at the far-right of the sheet, seen above.
One other thing to keep in mind is the column type. This took me a little bit to figure out but column type is important because it is how Excel figures out what to do with the columns. For example in the little snip of my screen, you can see "Shipping Cost" is numerical, represented by the "1.2" figures. This tells Excel that you can do numerical calculations with this column. Next to it is "Order Priority" which is using the "Text" type, meaning that it will only allow you to do functions that are text based. Finally, the columns I copied I had to transform their type from just "Date" to "Date/Time" so that I can properly display what I am doing. The TAT I am interested in goes down to the minute, so having this was critical in figuring out what I needed.
As you can see in the above screenshot, I had multiple steps I needed to take in order to get the data formatted to what I really needed. This required each step being repeated for each new file I had. The files all contained the same columns thankfully, just data for different months. They also unfortunately had to be separate files because each one was discrete. Doing all of the calculations and formatting on the first file was a bit time consuming, as it took about 1.5 hours to get all of the queries I needed set up properly. I enjoyed it and took it as a learning experience!
Now for the fun part, after all that pain staking manipulation was done! I was scouring the internet for ways to get the modifications I made to go from one file to the next. There were some ways that I had tried but it wasn't very intuitive and ultimately not successful. I actually was poking around myself trying to determine how to do it when I figured "why not?" and tried the advanced editor. I had opened it once before just to see what it was about (I love just clicking around things in programs I am trying to learn, for me that's the best way to learn!) and thought I'd give it a try here. I knew the columns in the data I was using were the same so I was thinking that it should be applicable for what I wanted to do here too. It was!
When you have the advanced editor open (removed my username, sorry folks!) you are going to want to copy everything below the "Source" line because this will change between files naturally. Ideally the first sheet should be the same name, but not always. I ran into this issue with my first go and was very proud when I figured out how to fix it and correct my error without frantically consulting the interwebs on what I did wrong. I had my first sheet named something ending in "All" (Data All for example) in the source file, and in the file I wanted to paste the steps I took into, I had it named "Formatted" and was upset that it didn't like the error!
It gave me the above error and I wanted to yell at something, fearing it wouldn't let me do what I wanted. I breathed and closed the power query editor without saving changes and opened it back up so that what I did didn't actually save. When in doubt, close without saving and restart I've learned... I regrouped and discovered that I had the naming convention between the two files messed up. I corrected it and got wicked excited that it did what I wanted it to! Perhaps only those who use Excel often would become elated for such a scenario, but I didn't care lol. All of the work I did to the other sheet, duplicating columns, transforming, renaming, it all applied itself to the new data in a blink! I was ecstatic!
Stay tuned next week for the second edition of this post, I didn't have the time to complete it today!
Let me know how you use it!
Are you an excel user? Let me know how you use it, I'd love to talk about it and see and learn things from you! Was my little guide easy to follow? Let me know if there are improvements on explaining something. I would also be interested to find out an easier way to do this if possible, just to learn something more. Can't hurt to try.
I know @paulag is an excel wiz on here but I'd love to find some others who have an affinity for number crunching and analytics! Consider following her other page, https://steempeak.com/@theexcelclub for more!
Did you enjoy this post or my other posts? Please consider following me for more! @cmplxty