What is Data Science?
Data Science and Machine Learning are two in-demand, high-earning professions within the technology field. Both use data to help determine how we create and improve products, services, infrastructure, and more. Data Science draws on linear algebra, statistical modelling, and programming to create algorithms that extract knowledge and insights from noisy, cluttered data, and then applies those insights across many different domains. For example, in Data Science you often work with what is known as big data. This could be several thousand images stored as Red Green Blue (RGB) layers (image recognition), text extracted from Facebook (natural language processing), or a company's stock information (quantitative finance).
This substack will even help you out with your interview preparation.
What is Machine Learning?
Machine Learning (ML) is an offshoot of artificial intelligence in which computer algorithms build a model from input data and then use that model to make predictions on new data. ML algorithms can also be used to discover underlying patterns in the data that are otherwise hard to see. They are used constantly in everyday life. For example, Convolutional Neural Networks (CNNs) analyze images, learning-to-rank algorithms decide how items such as search results should be ordered, and decision trees are widely used to make business decisions that optimize revenue.
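To make the idea of "build a model from input data, then predict" concrete, here is a minimal sketch in plain Python. It assumes the simplest possible model, a straight line fitted by ordinary least squares, and the hours and scores data are made up for illustration. The function names `fit_line` and `predict` are our own, not from any library.

```python
def fit_line(xs, ys):
    """Learn a slope and intercept from example inputs and outputs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together, divided by how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned model on an input it has not seen before."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours studied and the exam score that followed.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

model = fit_line(hours, scores)
new_prediction = predict(model, 6.0)  # 62.0 for this made-up data
```

Real ML libraries wrap the same two steps (fit, then predict) around far more flexible models, but the workflow is the same.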
Which Programming Language Will You Use?
Most people in the industry will tell you to focus on either R or Python. In practice, if the position focuses on linear models, involves time series analysis, or is numerically heavy, you will likely be using R. If the position is more general and involves working closely with the IT team, you will most likely be using Python. We also strongly advise that you learn SQL, which is needed for data extraction and will maximize your abilities and marketability in the workplace.
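As a small taste of how Python and SQL work together for data extraction, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `sales` table and its values are invented for this example. In a real job the database, table names, and connection details would be different.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# SQL does the extraction and aggregation; Python receives the result.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

# rows is [("north", 150.0), ("south", 80.0)]
```

The pattern to notice is the division of labor: SQL pulls and summarizes the data, and Python (or R) takes over for the analysis.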
Why do we need Linear Algebra?
Most of the algorithms used in ML rely heavily on concepts taught in Linear Algebra. One everyday example is image compression. Some compression approaches rely on a linear algebra technique called dimensionality reduction: we take a chunk of data and determine where the most distinct, important pieces are located. Those pieces are kept and the remaining, less important pieces are discarded, leaving the overall shape intact.
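The "keep the important pieces, discard the rest" idea can be sketched in plain Python. This is only an illustration under our own assumptions, not how any particular image format works: it treats a tiny made-up grayscale image as a matrix and keeps just its single strongest pattern (a rank-one approximation), found with a method called power iteration. All function names here are hypothetical helpers written for this example.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def multiply(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def length(vector):
    return sum(x * x for x in vector) ** 0.5


def rank_one_approximation(matrix, iterations=100):
    """Keep only the single most important pattern in the matrix."""
    transposed = transpose(matrix)
    direction = [1.0] * len(matrix[0])
    # Power iteration: repeatedly applying A^T A pulls the vector
    # toward the direction that explains most of the data.
    for _ in range(iterations):
        direction = multiply(transposed, multiply(matrix, direction))
        scale = length(direction)
        direction = [x / scale for x in direction]
    projected = multiply(matrix, direction)
    strength = length(projected)
    pattern = [x / strength for x in projected]
    # Rebuild a full-size matrix from one column pattern, one row pattern,
    # and one strength value.
    return [[strength * p * d for d in direction] for p in pattern]


# A made-up 3x3 "image" of pixel brightness values.
image = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 61.0],
    [30.0, 60.0, 90.0],
]

compressed = rank_one_approximation(image)
largest_error = max(
    abs(a - b)
    for row_a, row_b in zip(image, compressed)
    for a, b in zip(row_a, row_b)
)
```

For an image with m rows and n columns, the rank-one version only needs m + n + 1 numbers instead of m times n, while `largest_error` tells you how much detail was lost. In practice you would use a library routine such as a singular value decomposition and keep more than one pattern.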
Why do we need Statistical Modelling?
Some algorithms are well suited to certain types of data and less efficient with others. To fully understand what a model is telling you, you need a solid understanding of statistics because, when you peel back all the layers, Machine Learning is built upon statistical modelling.
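Here is a small sketch of why the match between algorithm and data matters. It assumes a straight-line model and measures the fit with R squared, a standard statistic where 1 means the model explains all the variation in the data and 0 means it explains none. Both data sets are invented for illustration, and `r_squared_of_line_fit` is our own helper name.

```python
def r_squared_of_line_fit(xs, ys):
    """Fit a straight line by least squares and report how well it fits."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # Compare the model's leftover error with the data's total variation.
    residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - residual / total


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
straight_data = [1.0, 3.0, 5.0, 7.0, 9.0]  # follows y = 2x + 5
curved_data = [4.0, 1.0, 0.0, 1.0, 4.0]    # follows y = x squared

r2_straight = r_squared_of_line_fit(xs, straight_data)  # 1.0: a perfect fit
r2_curved = r_squared_of_line_fit(xs, curved_data)      # 0.0: the line explains nothing
```

The same algorithm gives a perfect answer on one data set and a useless one on the other. Knowing the statistics behind the score is what lets you spot that and choose a better-suited model.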