For the first project of the big data analytics programming course, we implemented several algorithms for handling big data efficiently in C++: a perceptron (the building block of neural networks) and Naive Bayes, the latter in two variants that cope with an unbounded feature space (min-count and feature hashing). Model correctness and efficiency were both major requirements, so the C++ code was written to reduce memory usage and runtime through careful use of pointers and references.

  • skills: C++, Efficient Programming, Big Data
  • status: Completed (November 2023)
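
To illustrate the feature hashing variant described above, here is a minimal sketch of how tokens from an unbounded vocabulary can be folded into a fixed-size count table for one Naive Bayes class. It is not the project's actual code: the class name `HashedCounts`, the choice of FNV-1a as the hash function, and the Laplace smoothing in `probability` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// FNV-1a 64-bit hash (an assumed choice; any well-mixing hash would do).
inline std::uint64_t fnv1a(std::string_view token) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : token) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Hypothetical per-class count table: memory is fixed at num_buckets
// counters no matter how many distinct tokens are seen.
class HashedCounts {
public:
    explicit HashedCounts(std::size_t num_buckets) : counts_(num_buckets, 0) {
        assert(num_buckets > 0);
    }

    // Tokens are passed as string_view so no string is copied.
    void add(std::string_view token) {
        ++counts_[bucket(token)];
        ++total_;
    }

    // Count of the bucket the token maps to; collisions can inflate it.
    std::uint32_t count(std::string_view token) const {
        return counts_[bucket(token)];
    }

    std::uint64_t total() const { return total_; }
    std::size_t num_buckets() const { return counts_.size(); }

    // Laplace-smoothed estimate of P(token | class) over the buckets.
    double probability(std::string_view token) const {
        return (count(token) + 1.0) /
               (static_cast<double>(total_) + static_cast<double>(counts_.size()));
    }

private:
    std::size_t bucket(std::string_view token) const {
        return static_cast<std::size_t>(fnv1a(token) % counts_.size());
    }

    std::vector<std::uint32_t> counts_;
    std::uint64_t total_ = 0;
};
```

A caller would construct one table per class, for example `HashedCounts spam(1 << 20);`, call `spam.add(token)` for every token in a training example, and combine `spam.probability(token)` values in log space at prediction time. The trade-off is that distinct tokens may share a bucket, so counts are upper bounds rather than exact values.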