Sharing the link to a great article on the “7 Best Practices” of Data Management. In my first post, I mentioned how zillions and zillions of data points are being collected on us continuously. As a consequence, Data Science is a rapidly expanding and important field. This article is a great introduction to it. 🙂
*****************
Data management is a broad topic that encompasses the whole life cycle of data. It includes the extraction, transformation, and loading of data (ETL), data storage, data wrangling and cleaning, data analysis, data visualization, data governance, information security, data mining, and modeling. Successful data management involves careful attention to every step of data processing, from choosing which data to collect to using it to make predictions. Below are the 7 best practices I stress when advising clients.
1 – Start with a business question you need to answer
Begin by setting goals for your data. What kind of insights do you wish to uncover from it? How much of your data can you reliably and repeatably collect? Brainstorm with a multidisciplinary team and note the needs of each member, then prioritize your data fields. Your data should not be cumbersome to explore. Also, if you are collecting data from people, bear in mind that it’s often poor business practice to overwhelm them with questions. Prioritize your metrics and select the most relevant ones for collection.
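One way to make the prioritization step concrete is to tally how many team members asked for each field and keep only the most requested ones. The Python sketch below is just an illustration of that idea: the team names, field names, and the `prioritize_fields` helper are all hypothetical, and a real project would likely weigh business value and collection cost as well as simple vote counts.

```python
from collections import Counter

# Hypothetical input: the fields each team member says they need.
team_needs = {
    "marketing": ["signup_source", "age_group", "region"],
    "finance": ["region", "plan_price"],
    "support": ["region", "signup_source", "last_contact_date"],
}

def prioritize_fields(needs, max_fields):
    """Rank fields by how many team members asked for them; keep the top ones."""
    # Count each field once per team member, even if they listed it twice.
    votes = Counter(field for fields in needs.values() for field in set(fields))
    # Sort by vote count (descending), then alphabetically for a stable order.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [field for field, _ in ranked[:max_fields]]

selected = prioritize_fields(team_needs, max_fields=3)
print(selected)
```

Capping the list with `max_fields` reflects the advice above: a short, agreed set of fields is easier to collect reliably and less likely to overwhelm the people you are asking.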