24
How to import data from spreadsheet easily and productive
This post is for developers, who just got a huge spreadsheet of data and have to import them to the application. Recently I was in that position and I have a couple of takeaways how to make the process easier, faster and less painful.
Non-technical people in general see data much different from a developer. You will have to remove duplicates, standardize fields, maybe add additional fields and probably remove a lot of useless data. This step is to make the parsing easier and remember, that most of the time it's easier to change data in the document, than making sophisticated mechanisms in the code.
IsPacketA
, IsPacketB
, etc).,
worked best for me).Working with plain text is much easier, than parsing rows and cells. In C# there is a super simple NuGet called CsvHelper that will load all data in a few lines of code. Make sure that delimiter may conflict with your data - for me, semicolon worked perfectly. I tried working with TSV but in my case it was missing new lines and some empty columns.
Now the data is ready to be loaded in the code. Before developing anything, watch for new lines and special characters - they might be parsed incorrectly and/or cause the application to crash. Maybe you will have to go back to fix the data in the spreadsheet but it much more time-consuming now, than at the later stages.
Store id's/keys in the spreadsheet from step 1. This will help work with the changes at later stages - it will be easier to locate row by key in the database to update it. Guid generator will help you with generating unique values.
Having two sources of data is a nightmare. You will encounter situations, where you will have to investigate, which document is newer, merge changes, resolve conflicts... To avoid this, explain your "fixed document" (step 1) to people who created the source document.
Google Sheets makes it easy with version history. Maybe think about automating the process.