If you are taking the higher-level Data Analyst and Data Scientist certifications, you will be asked to write a report, which is graded by human markers.
Our tip is to make it really easy for whoever will grade your work. Create a list with one point for each column. That way, the grader can see at a glance that you have looked at every single column and has no reason to mark you down for missing one.
Not only are you making the grading easier, but it is also easier for you to see what you have done and be certain you have checked every column.
Here is an example solution:
The original data is 200 rows and 9 columns. After validation, there were 198 rows remaining. The following describes what I did to each column:
- Region: There were 10 unique regions, as expected.
- Place name: There were 185 unique place names, suggesting that some names are duplicated. This should be confirmed with the team providing the data.
- Place type: There are only 4 place types: Coffee Shop, Cafe, Espresso Bar and Others. This matches what is expected.
- Rating: Values range from 3.9 to 5.0, so all are within the range expected.
- Reviews: I removed the 2 rows where the Reviews value was missing, leaving 198 rows of data.
- Price: There are 3 price categories, as expected.
- Delivery option: There are 2 delivery options - True/False, as expected.
- Dine-in option: I converted missing values to False. There were originally no False values.
- Takeaway option: I converted missing values to False. There were also originally no False values.
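
A per-column list like the one above can be drafted from the data itself, which makes it harder to skip a column by accident. The following is a minimal sketch in plain Python, not the method the certification requires. The toy rows, the reduced set of columns, and the helper names `clean` and `column_report` are all hypothetical and stand in for your real data and tooling (the same idea carries over to a data frame library).

```python
# Toy rows standing in for the real data; values and helper names are hypothetical.
rows = [
    {"Region": "A", "Place name": "Bean There", "Rating": 4.5,
     "Reviews": 120, "Dine-in option": True},
    {"Region": "B", "Place name": "Daily Grind", "Rating": 4.1,
     "Reviews": None, "Dine-in option": None},
    {"Region": "A", "Place name": "Bean There", "Rating": 3.9,
     "Reviews": 45, "Dine-in option": None},
]


def clean(rows):
    """Apply two cleaning steps like those in the example report."""
    # Drop rows where Reviews is missing (copy each row so the input is untouched).
    kept = [dict(r) for r in rows if r["Reviews"] is not None]
    # Treat a missing dine-in flag as False.
    for r in kept:
        if r["Dine-in option"] is None:
            r["Dine-in option"] = False
    return kept


def column_report(rows):
    """Return one bullet per column, so no column is skipped."""
    lines = []
    for col in rows[0]:
        values = [r[col] for r in rows]
        missing = sum(v is None for v in values)
        present = [v for v in values if v is not None]
        detail = f"{len(set(present))} unique values, {missing} missing"
        # Add a range for numeric columns only (bool is excluded on purpose).
        is_numeric = present and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        if is_numeric:
            detail += f", range {min(present)} to {max(present)}"
        lines.append(f"- {col}: {detail}")
    return lines


cleaned = clean(rows)
print(f"{len(rows)} rows before validation, {len(cleaned)} after")
for line in column_report(cleaned):
    print(line)
```

The generated bullets are only a starting point. You still need to add, in your own words, what you expected for each column and what you changed.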