Python2: Data Structures: Lists & Shallow vs Deep Copy
This deals with Python 3. We talk about lists, the second data structure we cover in Python. We also introduce a concept called shallow copy and compare it with deep copy.
Table of Contents:
Lists
Shallow vs Deep Copy
Finishing Remarks
1 - Lists
The list data structure is another sequence type in Python. Lists are very similar to tuples because they both support indexing, slicing, checking for the existence of an element, and iterating over elements using a for loop.
However, the main difference is that unlike tuples, lists are mutable. This means that after a list has been created, you can still change the value at a given index.
Where This is Used
If you are interested in a career in Data Science (DS) and expect to use Python, you will work with either numpy arrays or pandas DataFrames. It doesn't matter which, because both are routinely built from lists and borrow much of their indexing and slicing syntax from lists, so a solid grasp of lists carries over.
Creating Lists
In Python, we have 3 ways to create a list:
Option 1: Use the square brackets: [ ]
Option 2: Use the list() function
We can use the list() function on a tuple to convert the tuple to a list, or we can use it on a string to turn the string into a list, where every character in the string becomes its own separate element.
Option 3: Use the .split() method
We can use the .split() method on a string to split it into a list, where the argument we pass to .split() is the separator at which the splits happen.
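The three options above can be sketched as follows. The variable names and values here are illustrative assumptions, not taken from the original examples.

```python
# Option 1: square brackets
abc = ['elbow', 'weenis', 'knee']

# Option 2: list() on a tuple, or on a string
from_tuple = list((1, 2, 3))      # [1, 2, 3]
from_string = list('hey')         # ['h', 'e', 'y']

# Option 3: .split() on a string, splitting at each comma
from_split = 'red,green,blue'.split(',')   # ['red', 'green', 'blue']

print(abc, from_tuple, from_string, from_split)
```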
Common List Functions
Index Notation
We can use the index notation to pull an element from a list.
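As a small sketch, assuming a list abc with made-up values, indexing starts at 0 and negative indexes count from the end:

```python
abc = ['elbow', 'weenis', 'knee']

first = abc[0]    # 'elbow'  (indexes start at 0)
last = abc[-1]    # 'knee'   (negative indexes count from the end)

print(first, last)
```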
In operator
We can use the in operator to see if an element exists inside a list. Here we check if 'weenis' exists inside our list abc.
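A minimal sketch of that check, assuming abc holds the illustrative values below:

```python
abc = ['elbow', 'weenis', 'knee']

has_weenis = 'weenis' in abc   # True
has_ankle = 'ankle' in abc     # False

print(has_weenis, has_ankle)
```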
List mutability
Recall that both strings & tuples are immutable, meaning that once they are created, we cannot change them. However, lists are mutable and we can change them once they are created.
Doesn’t work:
Does work:
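Both cases could look something like the sketch below. The tuple and list values are assumptions chosen for illustration.

```python
# Doesn't work: tuples are immutable, so item assignment raises TypeError
point = ('elbow', 'weenis', 'knee')
try:
    point[0] = 'wrist'
except TypeError as err:
    print('TypeError:', err)

# Does work: lists are mutable, so the same assignment succeeds
abc = ['elbow', 'weenis', 'knee']
abc[0] = 'wrist'
print(abc)   # ['wrist', 'weenis', 'knee']
```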
Insert
The .insert() function can be used to insert a new element at a specific index in a list. Here we will insert 'abcd' at index 1.
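A short sketch, assuming abc starts with the illustrative values below. Everything from index 1 onward shifts one place to the right.

```python
abc = ['elbow', 'weenis', 'knee']

abc.insert(1, 'abcd')   # insert 'abcd' at index 1
print(abc)              # ['elbow', 'abcd', 'weenis', 'knee']
```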
Pop
The .pop() function takes 1 optional parameter, an index, and removes the value from the list at that index. If you leave the index out, it removes the last element. It also returns the removed value, so you can store it by assigning .pop() to a variable. Here we will pop 'weenis' and save it to the variable xyz.
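As a sketch, assuming 'weenis' sits at index 1 of abc (the other values are made up):

```python
abc = ['elbow', 'weenis', 'knee']

xyz = abc.pop(1)    # removes and returns the element at index 1
print(xyz)          # 'weenis'
print(abc)          # ['elbow', 'knee']

last = abc.pop()    # with no index, pop() removes the last element
print(last)         # 'knee'
```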
Append
The .append() function is used to append a new element to the end of a list.
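A minimal sketch with illustrative values:

```python
abc = ['elbow', 'weenis']

abc.append('knee')   # adds one element to the end
print(abc)           # ['elbow', 'weenis', 'knee']
```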
Extend
The .extend() function is used to add several new elements to the end of a list. It takes an iterable, such as another list, and adds each of its elements one by one.
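The sketch below uses made-up values and also contrasts .extend() with .append(), since passing a list to .append() adds it as a single nested element.

```python
abc = ['elbow', 'weenis']
abc.extend(['knee', 'ankle'])     # adds each element individually
print(abc)                        # ['elbow', 'weenis', 'knee', 'ankle']

nested = ['elbow', 'weenis']
nested.append(['knee', 'ankle'])  # adds the whole list as one element
print(nested)                     # ['elbow', 'weenis', ['knee', 'ankle']]
```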
Min/Max/Sum
You can use the min(), max(), or sum() function to get a quick aggregate calculation of the elements in the list, exactly the same way as we did for tuples.
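For example, assuming a list of numbers called nums (a hypothetical name used only here):

```python
nums = [4, 1, 7, 2]

smallest = min(nums)   # 1
largest = max(nums)    # 7
total = sum(nums)      # 14

print(smallest, largest, total)
```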
2 - Shallow vs Deep Copy
Copying a List
Sometimes we want to copy one list to another. One simple-looking way of doing it is to assign the list to a new variable.
Suppose we create a list called abc, and then we make a new variable called xyz and have it point to abc.
If we change a value in abc, you will notice that the change also shows up in xyz.
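This can be sketched as follows, with illustrative values for abc. The `is` check shows that both names refer to the very same list object.

```python
abc = [1, 2, 3]
xyz = abc            # no new list is created; xyz is another name for abc

abc[0] = 99
print(xyz)           # [99, 2, 3]
print(xyz is abc)    # True
```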
Shallow Copy
In the above example, you saw that changing our original list also impacts xyz. This is because we didn't really create a new list at all; xyz is just a second name for the same list object that abc points to. A shallow copy goes one step further: it creates a new list, but fills it with references to the same elements as the original. You can make one with abc.copy(), abc[:], list(abc), or the copy() function from the copy module.
Shallow copies duplicate as little as possible. A shallow copy of a list is a copy of the structure, not the elements. With a shallow copy, replacing an element at the top level of 1 list does not affect the other, but the 2 lists still share their elements, so changing a nested mutable element (such as an inner list) through 1 list applies the change to the other list as well.
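Here is a minimal sketch of a shallow copy, assuming a nested list with illustrative values. It uses copy.copy(); abc.copy() or abc[:] would behave the same way. The outer list is new, but the inner lists are shared.

```python
import copy

abc = [[1, 2], [3, 4]]
xyz = copy.copy(abc)        # new outer list, same inner lists

print(xyz is abc)           # False: the outer lists are different objects
print(xyz[0] is abc[0])     # True: the inner lists are shared

abc[0][0] = 99              # mutate a shared inner list
print(xyz)                  # [[99, 2], [3, 4]]

abc[1] = 'replaced'         # rebind a top-level slot in abc only
print(xyz)                  # [[99, 2], [3, 4]]
```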
Deep Copy
To copy a list and all its elements, you must make what's known as a deep copy. A deep copy is a truly independent copy of an object. To make a deep copy of a list, you can use the deepcopy() function from the copy module.
Deep copies duplicate everything. A deep copy of a list gives you 2 collections, with all of the elements in the original list duplicated. There are no shared references to keep track of, as the second list is literally just a brand new copy of the first.
Here is a simple visual you can use.
Fixing this headache
To fix this, all we have to do is use the deepcopy() function from Python's built-in copy module.
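A minimal sketch of the fix, assuming a nested list with illustrative values. After copy.deepcopy(), changes to abc, even to its inner lists, no longer reach xyz.

```python
import copy

abc = [[1, 2], [3, 4]]
xyz = copy.deepcopy(abc)    # duplicates the outer list and every inner list

abc[0][0] = 99
print(abc)                  # [[99, 2], [3, 4]]
print(xyz)                  # [[1, 2], [3, 4]]
print(xyz[0] is abc[0])     # False: nothing is shared
```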
3 - Finishing Remarks
I highly recommend going over the list functionality, as interviewers love to ask rapid-fire questions about it. I would also recommend studying and messing around with the deep copy vs shallow copy concept, as once again, it's a super easy way for an interviewer to ding you.