Linux is one of the most customizable and useful operating systems out there. Because it places few constraints on how the system operates, Linux is widely used in the computer science and data science fields.
Linux is useful in many different applications. In data science specifically, the ease of workflow automation makes Linux extremely powerful. It is important to note that every operating system has its limitations, Windows and macOS included. Linux may not be the right fit for everyone. Fun fact: macOS is not a derivative of Linux, but the two are cousins. macOS is built on Darwin, a Unix-like system with BSD roots, which is why so many terminal commands feel familiar on both.
This tutorial covers how to install Linux on a virtual machine. You can just as easily install Linux side-by-side with your existing operating system. I actually do this on my desktop, dedicating a whole drive to Linux. On my laptop, however, I only have one drive. I would rather not play around with disk partitioning or dual booting, since it is quite easy to irreversibly change something you didn’t want to change. Instead, I run Linux through a virtual machine.
Virtual machines are a phenomenal work-around to dual booting two operating systems. You preserve the drivers that your laptop needs to run (GPUs, USB devices, etc.) while still getting the functionality you desire through a different operating system. The first step is to install virtual machine software, in this case VirtualBox. You are more than free to use others too; VMware is also quite good.
The next step is to download your Linux distribution. I tend to stick to the tried-and-tested Ubuntu LTS releases. Ubuntu is one of the most widely used Linux distributions, and the LTS (long-term support) tag means that Canonical, the company behind Ubuntu, will support the release for 5 years after the release date. The next LTS release is scheduled for April 2018. The download link can be found here: Ubuntu
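Before attaching a multi-gigabyte image to anything, it can be worth checking that the download is intact. Ubuntu publishes SHA-256 checksums alongside its release images, so a minimal sketch of that check in Python might look like this. The function names here are my own, and the file name in the usage note is hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so a large .iso never has to fit in memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case
    and surrounding whitespace in the published value."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

You would call it as iso_matches("ubuntu.iso", "<digest copied from the release page>"), and only continue if it returns True.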
Installing Your Virtual Machine
To install your Linux image onto your virtual machine, first open up your virtual machine program. In this case, VirtualBox.
Go ahead and click the “New” icon in the corner and give your machine a name.
Give your machine the appropriate amount of RAM to run smoothly. In this case I have 16GB of RAM, so I allocate 50% of that (8GB) to my machine, plus another 12.5% (2GB) for additional capacity, for 10GB in total.
WARNING: If you allocate too much RAM, your computer can slow down or freeze. Your base operating system (Windows, macOS) requires a certain amount of RAM to operate on its own.
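The RAM arithmetic above is easy to redo for your own machine. Here is a rough sketch: the 50% and 12.5% fractions come from my setup above, while the 4GB kept back for the host is purely my assumption, not a VirtualBox rule, so adjust it to whatever your base operating system needs.

```python
def vm_memory_mb(host_ram_gb, base_fraction=0.50, extra_fraction=0.125,
                 host_reserve_gb=4):
    """Suggest how much RAM (in MB) to give the virtual machine.

    host_reserve_gb is an assumed safety margin for the host OS;
    it is not a value taken from VirtualBox itself.
    """
    if host_ram_gb <= host_reserve_gb:
        raise ValueError("not enough RAM to leave a reserve for the host")
    wanted_gb = host_ram_gb * (base_fraction + extra_fraction)
    ceiling_gb = host_ram_gb - host_reserve_gb
    # Never hand the VM more than the host can spare.
    return int(min(wanted_gb, ceiling_gb) * 1024)


print(vm_memory_mb(16))  # 16GB host: 62.5% is 10GB, i.e. 10240 MB
```

On a smaller 8GB laptop the same fractions would ask for 5GB, so the reserve kicks in and the function returns 4096 MB instead.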
Go ahead and create a virtual hard disk for storage. We can leave the settings the same for most of the installation.
I tend to create a fixed-size hard disk for my machines. A fixed-size disk is often faster than a dynamically allocated one, at the cost of claiming its full size on your drive right away. Both work well, so choose your preference.
Select how much storage your hard disk should contain. In my case I have 1TB of space and 10% (100GB) of that is more than enough. If you are pressed for space, you really only need around 15GB for all of your general purpose data science needs.
Now your virtual machine is set up!
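If you would rather script these steps than click through them, VirtualBox also ships a command-line tool called VBoxManage. The sketch below only builds and prints the commands; it does not run anything. The VM name and disk file name are hypothetical, and the subcommands and flags are written as I understand them from the VBoxManage documentation. They have changed between VirtualBox versions (older releases use createhd instead of createmedium), so check them against VBoxManage --help before relying on this.

```python
import shlex


def create_vm_commands(name, memory_mb, disk_mb, disk_path, fixed=True):
    """Build (but do not run) VBoxManage commands that mirror the
    wizard steps: name the VM, set its RAM, create and attach a disk.

    Flag names are assumptions to verify against your VirtualBox version.
    """
    variant = "Fixed" if fixed else "Standard"
    return [
        ["VBoxManage", "createvm", "--name", name,
         "--ostype", "Ubuntu_64", "--register"],
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb)],
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path,
         "--size", str(disk_mb), "--variant", variant],
        ["VBoxManage", "storagectl", name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd",
         "--medium", disk_path],
    ]


# 10GB of RAM and a 100GB fixed-size disk, both expressed in MB.
commands = create_vm_commands(
    name="ubuntu-lts-vm",            # hypothetical VM name
    memory_mb=10 * 1024,
    disk_mb=100 * 1024,
    disk_path="ubuntu-lts-vm.vdi",   # hypothetical disk file
)
for command in commands:
    print(shlex.join(command))
```

Printing the commands first makes it easy to read them over, or paste them into a terminal one at a time, before anything touches your disk.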
The next step is to connect your downloaded Linux image to your created virtual machine. Click on the “Settings” icon and navigate to the “Storage” tab.
Click on the disk icon in the “Attributes” menu and select where your Linux image is located (.iso file).
Your Linux image is now attached and you can run your machine!
Exit out of the “Settings” menu and click “Start” to boot your Linux virtual machine.
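Attaching the image and booting the machine can be sketched the same way with VBoxManage. Again, this only builds the commands. The VM name and .iso path are hypothetical, and the controller name, port, and device are assumptions about how your machine was created; running VBoxManage showvminfo with your VM's name should list the controllers it really has.

```python
def attach_iso_and_start_commands(name, iso_path, controller="IDE",
                                  port=1, device=0):
    """Build (but do not run) VBoxManage commands that put the .iso in
    the VM's virtual optical drive and then start the VM.

    controller, port and device are assumed defaults; confirm them
    for your own VM before running the commands.
    """
    return [
        ["VBoxManage", "storageattach", name,
         "--storagectl", controller,
         "--port", str(port), "--device", str(device),
         "--type", "dvddrive", "--medium", iso_path],
        ["VBoxManage", "startvm", name],
    ]


for command in attach_iso_and_start_commands(
        "ubuntu-lts-vm",         # hypothetical VM name
        "ubuntu-desktop.iso"):   # hypothetical path to the download
    print(" ".join(command))
```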
We are almost done! All we have to do now is install Ubuntu onto the virtual machine.
First, we will need to select our preferred language.
Next, we will choose whether to install updates and third-party software drivers/programs. I tend to select both of these so I don’t have to do this manually later.
You will notice the next screen states that there is no operating system on this machine. This is working as intended! Your virtual machine should have no operating systems on it!
Go ahead and select “Erase disk and install Ubuntu”. There is technically nothing to erase so no harm done.
WARNING: Do not do this if you are installing Ubuntu in a dual boot setting. This will erase your entire disk.
Select your location. If you wish to remain private, select any city in your timezone so the clock will make sense.
Select your keyboard layout. English (US) is the standard keyboard layout for the US and Canada.
Finally, give your “computer” a name, a password, and a user. In this example, my username is my first name and the computer’s name is “indiana-VM-sample”.
Ubuntu will take a few minutes to install onto your machine. Afterwards, your machine is ready to go!
To install the data science software stack, you can find the instructions on my Software Stack blog post.