What is Data Science?

Data Science and Machine Learning are two in-demand, high-earning professions within the field of technology. Both use data to guide how we create and improve products, services, infrastructure, and more. Data Science draws on linear algebra, statistical modelling, and programming to build algorithms that extract knowledge and insights from noisy, cluttered data, and then applies those insights across many different domains. For example, in Data Science you often work with what is known as big data: this could be several thousand images stored as Red Green Blue (RGB) layers (Image Recognition), text extracted from Facebook (Natural Language Processing), or a company’s stock information (Quants).

Data Science: Definition and Life-Cycle

This substack will even help you out with your interview preparation.

What is Machine Learning?

Machine Learning (ML) is a branch of artificial intelligence in which computer algorithms build a model from input data and then use that model to make predictions about new data. ML algorithms can also uncover underlying patterns in the data that are otherwise hard to see. They are used constantly in your everyday life. For example, Convolutional Neural Networks (CNNs) analyze images, learning-to-rank algorithms decide the order in which results are shown, and decision trees are widely used to make business decisions that optimize revenue.

When you solve a CAPTCHA, you are actually providing an ML algorithm with image input data that it can study to get better.
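
To make the "build a model from input data, then predict on new data" idea concrete, here is a minimal sketch using a 1-nearest-neighbor classifier, one of the simplest ML algorithms. The animal measurements and the function name are made up for illustration, and this is not how a production system such as a CNN works.

```python
import math

def nearest_neighbor_predict(train_points, train_labels, new_point):
    """Return the label of the training point closest to new_point."""
    distances = [math.dist(p, new_point) for p in train_points]
    closest = distances.index(min(distances))
    return train_labels[closest]

# Hypothetical toy data: (height_cm, weight_kg) -> animal label
train_points = [(20, 4), (25, 5), (60, 25), (70, 30)]
train_labels = ["cat", "cat", "dog", "dog"]

# Predict labels for measurements the "model" has never seen.
print(nearest_neighbor_predict(train_points, train_labels, (22, 4.5)))  # cat
print(nearest_neighbor_predict(train_points, train_labels, (65, 28)))   # dog
```

The "model" here is just the stored training data. Real algorithms learn a more compact representation, but the workflow of learning from examples and then predicting on new inputs is the same.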

Which Programming Language Will You Use?

There are a lot of languages you can use, but Python, R, and SQL are by far the most important.

Most people in the industry will tell you to focus on either R or Python. In practice, if the position is more focused on linear models, time series analysis, or other numerically heavy work, you will probably be using R. If the position is more general and involves working closely with the IT team, you will most likely be using Python. Additionally, we strongly advise that you learn SQL, which is required for data extraction and will maximize your abilities and marketability in the workplace.
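
As a small illustration of how SQL and Python work together for data extraction, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table and its contents are hypothetical. In a real job you would query your company's database instead.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 120.0), ("bob", 35.5), ("alice", 80.0), ("carol", 15.0)],
)

# Data extraction step: total spend per customer, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

for customer, total in rows:
    print(customer, total)
```

The SQL query does the filtering and aggregation inside the database, and Python (or R) picks up the much smaller result for analysis.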

Why do we need Linear Algebra?

Most of the algorithms used in ML rely heavily on concepts taught in Linear Algebra. One example found in everyday life is image compression. These algorithms rely on a linear algebra technique called Dimensionality Reduction: we take a chunk of data and determine where its most distinct, important pieces are located. Those pieces are kept and the remaining, less important pieces are discarded, leaving the overall shape intact.

Image A has the highest amount of detail, while B is image A after some compression. You can see the overall shape is the same, but the quality is much lower.
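
The "keep the important pieces, discard the rest" idea can be sketched with a rank-1 approximation: we find the single strongest pattern in a grid of pixel values and store only that pattern plus one weight per row. The version below uses power iteration in plain Python. The tiny `image` grid is made up, and real image compression uses more components and more sophisticated methods, so treat this as an illustration only.

```python
import math

def rank_one_approximation(matrix, iterations=100):
    """Approximate a non-zero matrix by its single strongest pattern."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0] * cols  # starting guess for the dominant column pattern
    for _ in range(iterations):
        # Power iteration: u = A v, then v = A^T u, then normalize v.
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # One weight per row says how strongly that row follows the pattern.
    weights = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    approx = [[w * x for x in v] for w in weights]
    return weights, v, approx

# Hypothetical 4x3 grid of pixel intensities (12 numbers).
image = [[10, 20, 30], [11, 19, 31], [20, 40, 60], [9, 21, 29]]
weights, pattern, compressed = rank_one_approximation(image)

# The compressed form needs only 4 weights + 3 pattern values = 7 numbers.
print(len(weights) + len(pattern))
for row in compressed:
    print([round(x, 1) for x in row])
```

The reconstructed rows are close to the originals but not identical, which mirrors the loss of quality you see between image A and image B.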

Why do we need Statistical Modelling?

Some algorithms are better suited to certain types of data, and others are less efficient at dealing with them. To fully understand the information a model is conveying to you, you will need a solid understanding of statistics because, when you peel back all the layers, Machine Learning is based upon statistical modelling.

This is a simple summary of a linear regression. We will talk more about what all of the columns and numbers mean.
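
To give a feel for where some of the numbers in a regression summary come from, here is a minimal sketch that fits a straight line by ordinary least squares and reports the slope, intercept, and R-squared. The function name and the data are made up for illustration. A full summary from a statistics package also reports standard errors and p-values, which this sketch leaves out.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical data: years of experience vs. salary in thousands.
years = [1, 2, 3, 4, 5]
salary = [42, 48, 55, 59, 66]

slope, intercept, r_squared = fit_line(years, salary)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.3f}")
```

The slope is the coefficient you would read off the summary table: the expected change in the outcome for a one-unit increase in the input.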
