Projects


The project is meant to give you experience with real data in the context of an unstructured exploration. There are only two hard requirements:

If you cannot find anything compelling to do in the Open Baltimore data, you can use other datasets, but you must get written approval (in an email) from the instructor before using them in your project. In particular, I'm not overly fond of Kaggle datasets, as they tend to be overused, with too much existing analysis and code lying around on the internet.

The final report should be single-spaced, in a 12-point font, with 1-inch margins. Other than that, any style is fine. My expectation is that the report will contain at least the following:

In terms of length, given that there will be visualizations, tables, histograms, etc. in the document, a paper shorter than 10 pages will be met with some skepticism (see the discussion above about complexity). A paper longer than 30 pages may test the limits of the reader's attention span. Note that teams with more than one person should have tried more things. This does not mean that a two-person team's paper should be twice as long as an individual's. Rather, I'd expect the larger teams to explore more of the data, try more things, and present more insights.

In the end, the best projects will be ones that learn something interesting from the data. That is, if you tell me that the data says that most crime occurs after midnight on the weekends, I won't be surprised or find that particularly insightful. But if you tell me that crime patterns by weapon seem to move geographically with a weekly cycle (for example), I'd think that was pretty interesting.
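
As a rough illustration of the second kind of question, here is a minimal Python sketch that asks, for each weapon and day of the week, which district sees the most incidents. Everything in it is an assumption made for illustration: the rows are invented toy data (not Open Baltimore records), the field names `when`, `weapon`, and `district` are hypothetical, and "most common district" is only one crude way to look for geographic movement over a weekly cycle.

```python
from collections import Counter, defaultdict
from datetime import datetime

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Toy rows invented for illustration only; NOT real Open Baltimore data.
# Hypothetical fields: (when, weapon, district).
ROWS = [
    ("2024-01-01 23:10", "KNIFE", "Central"),    # a Monday
    ("2024-01-01 21:00", "KNIFE", "Central"),
    ("2024-01-03 02:30", "KNIFE", "Eastern"),    # a Wednesday
    ("2024-01-06 01:15", "FIREARM", "Western"),  # a Saturday
    ("2024-01-06 03:40", "FIREARM", "Western"),
    ("2024-01-08 22:05", "KNIFE", "Central"),    # a Monday
    ("2024-01-10 00:20", "KNIFE", "Eastern"),    # a Wednesday
]


def top_district_by_weekday(rows):
    """Map (weapon, weekday) -> the most common district for that pair."""
    counts = defaultdict(Counter)
    for when, weapon, district in rows:
        day = DAYS[datetime.strptime(when, "%Y-%m-%d %H:%M").weekday()]
        counts[(weapon, day)][district] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


result = top_district_by_weekday(ROWS)
for (weapon, day), district in sorted(result.items()):
    print(weapon, day, district)
```

If the top district for a given weapon changes from day to day and the same pattern repeats week after week, that is a hint worth following up with proper maps and statistics; a table like this is a starting point, not a finding.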