Fix the Git completion with Oh-my-Zsh on Mac

In a previous post, I have explained how I have setup oh-my-zsh with the git plugin. I am also using homebrew to manage the packages installed on my Mac. After upgrading Git recently, I have noticed the Git completion was not as powerful anymore.

Continue reading

MongoDB and Apache Spark - Getting started tutorial

MongoDB and Apache Spark are two popular Big Data technologies.

In my previous post, I listed the capabilities of the MongoDB connector for Spark. In this tutorial, I will show you how to configure Spark to connect to MongoDB, load data, and write queries.

To demonstrate how to use Spark with MongoDB, I will use the zip codes from MongoDB tutorial on the aggregation pipeline documentation using a zip code data set. I have prepared a Maven project and a Docker Compose file to get you started quickly.

Continue reading

Introduction to the MongoDB connector for Apache Spark

MongoDB is one of the most popular NoSQL databases. Its unique capabilities to store document-oriented data using the built-in sharding and replication features provide horizontal scalability as well as high availability.

Apache Spark is another popular “Big Data” technology. Spark provides a lower entry level to the world of distributed computing by offering an easier to use, faster, and in-memory framework than the MapReduce framework. Apache Spark is intended to be used with any distributed storage, e.g. HDFS, Apache Cassandra with the Datastax’s spark-cassandra-connector and now the MongoDB’s connector presented in this article.

By using Apache Spark as a data processing platform on top of a MongoDB database, you can benefit from all of the major Spark API features: the RDD model, the SQL (HiveQL) abstraction and the Machine Learning libraries.

In this article, I present the features of the connector and some use cases. An upcoming article will be a tutorial to demonstrate how to load data from MongoDB and run queries with Spark.

Continue reading

Add a Git hook to automatically verify a repository's email

I use Git a lot and I often have to switch between my personal repositories (ie: Github) and my professional (Ippon) repositories on the same laptop. My default Git email is configured to my personal email and I have often forgotten to configure it to my professional email when creating/cloning a repository for my company. Like everything in Git, this can be automated to avoid mistakes.

In this post, I will show how I use a Git hook to check the email configured in any repository before every commit.

Continue reading

My development environment

A customized development environment could be a huge productivity boost in the day to day work. In this post, I will share the tools and configurations I currently use.

Continue reading

git commit fixup

In this article, I will describe a git option to quickly fix a previous commit. This sometimes happens when I want to fix a typo in a previous commit after few new commits. The goal is to keep a “clean” git history with consistent commits adding features to facilitate the code reviews.

Continue reading