Update: 2021-12-27 22:22:06. This article covers conditions on lists in Python. You can leave a comment below if you need help.
How can you filter a list in Python using an arbitrary condition? The most Pythonic and most performant way is to use list comprehension [x for x in list if condition] to filter all elements from a list.
The most Pythonic way of filtering a list, in my opinion, is the list comprehension statement [x for x in list if condition]. You can replace condition with any function of x you would like to use as a filtering condition.
For example, if you want to filter all elements that are smaller than, say, 10, you'd use the list comprehension statement [x for x in list if x<10] to create a new list with all list elements that are smaller than 10.
Here are three examples of filtering a list:
lst = [8, 2, 6, 4, 3, 1]
# Filter all elements <8
small = [x for x in lst if x<8]
print(small)
# Filter all even elements
even = [x for x in lst if x%2==0]
print(even)
# Filter all odd elements
odd = [x for x in lst if x%2]
print(odd)
The output is:
# Elements <8
[2, 6, 4, 3, 1]
# Even Elements
[8, 2, 6, 4]
# Odd Elements
[3, 1]
This is the most efficient way of filtering a list and it's also the most Pythonic one. If you look for alternatives though, keep reading because I'll explain to you each and every nuance of filtering lists in Python in this comprehensive guide.
The filter(function, iterable) function takes as input a function that takes one argument (a list element) and returns a Boolean value indicating whether this list element should pass the filter. All elements that pass the filter are returned as a new iterable object (a filter object).
You can use the lambda keyword to create the function right where you pass it as an argument. The syntax is lambda x: expression, which means that x is the input argument and expression is the return value (it may or may not use x). For more information, see my detailed blog article about the lambda function.
lst = [8, 2, 6, 4, 3, 1]
# Filter all elements <8
small = filter(lambda x: x<8, lst)
print(list(small))
# Filter all even elements
even = filter(lambda x: x%2==0, lst)
print(list(even))
# Filter all odd elements
odd = filter(lambda x: x%2, lst)
print(list(odd))
The output is:
# Elements <8
[2, 6, 4, 3, 1]
# Even Elements
[8, 2, 6, 4]
# Odd Elements
[3, 1]
The filter() function returns a filter object that's an iterable. To convert it to a list, you use the list(…) constructor.
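One practical consequence of filter() returning an iterator rather than a list is that the result can be consumed only once. A minimal sketch (the variable names first_pass and second_pass are introduced here for illustration only):

```python
lst = [8, 2, 6, 4, 3, 1]

# filter() builds a lazy iterator; no list is created at this point.
small = filter(lambda x: x < 8, lst)

first_pass = list(small)   # consumes every element of the iterator
second_pass = list(small)  # the iterator is already exhausted

print(first_pass)   # [2, 6, 4, 3, 1]
print(second_pass)  # []
```

If you need the filtered values more than once, convert the filter object to a list first and reuse that list.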
I only include this option because people are still trying to use the map() function to filter out elements from a list. This is clearly the wrong way of doing it. The reason is that the map() function only allows you to transform each element of a list into a new element, so you'll still have the same number of elements in the list. Therefore, you need an extra step of filtering out all unwanted elements (for example, by using list comprehension). But if you're ready to take this extra step, you could also use list comprehension for filtering in the first place.
Here's what I mean:
lst = [8, 2, 6, 4, 3, 1]
# Filter all elements <8
small = list(map(lambda x: x if x<8 else None, lst))
small = [x for x in small if x!=None]
print(small)
# Filter all even elements
even = list(map(lambda x: x if x%2==0 else None, lst))
even = [x for x in even if x!=None]
print(even)
# Filter all odd elements
odd = list(map(lambda x: x if x%2 else None, lst))
odd = [x for x in odd if x!=None]
print(odd)
The output is again the same:
[2, 6, 4, 3, 1]
[8, 2, 6, 4]
[3, 1]
But this way of getting the output is clearly inefficient and hard to read.
A generator expression creates an iterator over a sequence of values. It works just like list comprehension, but without creating a list in memory, which makes it a bit more memory-efficient.
You can use generator expressions in any function call that requires an iterable as input, for example, to calculate the sum of all values in a list that meet a certain condition.
Here's a code example that shows how to sum over all integer values in a list (ignoring the rest) using a generator expression to filter out the non-integers:
lst = [6, 8, 2, 8, 'Alice']
print(sum(x for x in lst if type(x) == int))
# 24
You check the type(x) of each element and compare it against the integer type. This comparison returns True if the element is, in fact, of type integer.
You can define any complicated condition on a list element to decide whether to filter it out or not. Just create your own function (e.g., condition(x)) that takes one list element as input and returns the Boolean value True if the condition is met or False otherwise.
Here's a code example:
def condition(x):
    '''Define your arbitrarily
    complicated condition here'''
    return x < 10 and x > 0
lst = [11, 14, 3, 0, -1, -3]
# Filter out all elements that do
# not meet condition
filtered = [x for x in lst if condition(x)]
print(filtered)
# [3]
All elements that are smaller than 10 and larger than 0 are included in the filtered list. Thus, only one element 3 remains.
The same applies if you want to combine multiple conditions. Say, you want to filter out all elements with x>9 or x<1. These are two (simple) conditions that you can combine in a single function (e.g., condition(x)) that takes one list element as input and returns the Boolean value True if the element should be kept or False otherwise.
Here's the same code example as before:
def condition(x):
    '''Define your arbitrarily
    complicated condition here'''
    return x < 10 and x > 0
lst = [11, 14, 3, 0, -1, -3]
# Filter out all elements that do
# not meet condition
filtered = [x for x in lst if condition(x)]
print(filtered)
# [3]
All elements that are smaller than 10 and larger than 0 are included in the filtered list. Thus, only one element 3 remains.
Problem: Given a list of strings, how can you filter those that match a certain regular expression?
Example: Say, you've got a list ['Alice', 'Anne', 'Ann', 'Tom'] and you want to filter out those elements that do not match the regex pattern 'A.*e'. You expect the filtered list to be ['Alice', 'Anne'].
Solution: Use the list comprehension filtering framework [x for x in list if match] to filter out all elements that do not match a given pattern.
import re
# Define the list and the regex pattern to match
customers = ['Alice', 'Anne', 'Ann', 'Tom']
pattern = 'A.*e'
# Keep only the elements that match the pattern
filtered = [x for x in customers if re.match(pattern, x)]
print(filtered)
# ['Alice', 'Anne']
You use the re.match() function, which returns a match object if the pattern matches at the beginning of the string and None otherwise. In Python, any object evaluates to True in a Boolean context apart from some exceptions (e.g., None, 0, 0.0, and empty containers), so a match object passes the filter and None does not. If you need to refresh your basic understanding of the re.match() function, check out my detailed blog article that leads you step by step into this powerful Python tool.
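Because re.match() only anchors the pattern at the start of the string, the choice between re.match(), re.fullmatch(), and re.search() changes which elements pass the filter. The following sketch illustrates the difference; the extra names 'Alex' and 'DeAnne' are added here purely as test data and are not part of the original example:

```python
import re

# 'Alex' and 'DeAnne' are hypothetical extra entries for illustration.
customers = ['Alice', 'Anne', 'Ann', 'Tom', 'Alex', 'DeAnne']
pattern = 'A.*e'

# re.match: the pattern must match at the start of the string.
starts = [x for x in customers if re.match(pattern, x)]

# re.fullmatch: the pattern must cover the whole string.
whole = [x for x in customers if re.fullmatch(pattern, x)]

# re.search: the pattern may match anywhere in the string.
anywhere = [x for x in customers if re.search(pattern, x)]

print(starts)    # ['Alice', 'Anne', 'Alex']
print(whole)     # ['Alice', 'Anne']
print(anywhere)  # ['Alice', 'Anne', 'Alex', 'DeAnne']
```

Which of the three you want depends on whether the pattern is meant to describe a prefix, the entire string, or any substring.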
Short answer: To filter a list of lists for a condition on the inner lists, use the list comprehension statement [x for x in list if condition(x)] and replace condition(x) with your filtering condition that returns True to include inner list x, and False otherwise.
Lists are among the most important data structures in Python, and every master coder knows them by heart! Surprisingly, even intermediate coders don't know the best way to filter a list, let alone a list of lists in Python. This tutorial shows you how to do the latter!
Problem: Say, you've got a list of lists. You want to filter the list of lists so that only those inner lists remain that satisfy a certain condition. The condition is a function of the inner list, such as the average or sum of the inner list elements.
Example: Given the following list of lists with temperature measurements, one inner list per week.
# Measurements of a temperature sensor (7 per week)
temperature = [[10, 8, 9, 12, 13, 7, 8],  # week 1
               [9, 9, 5, 6, 6, 9, 11],    # week 2
               [10, 8, 8, 5, 6, 3, 1]]    # week 3
How do you keep only the colder weeks, those with an average temperature below 8? This is the output you desire:
print(cold_weeks)
# [[9, 9, 5, 6, 6, 9, 11], [10, 8, 8, 5, 6, 3, 1]]
There are two semantically equivalent methods to achieve this: list comprehension and the filter() function.
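A minimal sketch of the filtering step for the temperature example, assuming the average is computed as the sum divided by the number of measurements. The helper name average and the variable cold_weeks_alt are introduced here for illustration. It uses a list comprehension and, as an equivalent alternative, the built-in filter() function:

```python
# Measurements of a temperature sensor (7 per week)
temperature = [[10, 8, 9, 12, 13, 7, 8],  # week 1
               [9, 9, 5, 6, 6, 9, 11],    # week 2
               [10, 8, 8, 5, 6, 3, 1]]    # week 3


def average(week):
    """Return the mean of one inner list (hypothetical helper)."""
    return sum(week) / len(week)


# Variant 1: list comprehension
cold_weeks = [week for week in temperature if average(week) < 8]

# Variant 2: filter() with a lambda function
cold_weeks_alt = list(filter(lambda week: average(week) < 8, temperature))

print(cold_weeks)
# [[9, 9, 5, 6, 6, 9, 11], [10, 8, 8, 5, 6, 3, 1]]
```

Both variants keep weeks 2 and 3, whose averages are roughly 7.9 and 5.9, and drop week 1 with an average of roughly 9.6.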
Problem: Given a list of strings and a query string, how can you filter those that contain the query string?
Example: Say, you've got a list ['Alice', 'Anne', 'Ann', 'Tom'] and you want to obtain all elements that contain the substring 'An'. You expect the filtered list to be ['Anne', 'Ann'].
Solution: Use the list comprehension filtering framework [x for x in list if condition] to filter out all elements that do not contain another string.
# Define the list
customers = ['Alice', 'Anne', 'Ann', 'Tom']
# Keep all elements that contain 'An'
filtered = [x for x in customers if 'An' in x]
print(filtered)
# ['Anne', 'Ann']
You use the basic string membership operator in to check if an element passes the filter or not.
Problem: Given a list of strings, how can you remove all empty strings?
Example: Say, you've got a list ['Alice', 'Anne', '', 'Ann', '', 'Tom'] and you want to get a new list ['Alice', 'Anne', 'Ann', 'Tom'] of non-empty strings.
Solution: Use the list comprehension filtering framework [x for x in list if x] to filter out all empty strings.
# Define the list
customers = ['Alice', 'Anne', '', 'Ann', '', 'Tom']
# Keep only the non-empty strings
filtered = [x for x in customers if x]
print(filtered)
# ['Alice', 'Anne', 'Ann', 'Tom']
You use the property that Python assigns the Boolean value False to the empty string ''.
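The same truthiness rule can also be used through the built-in filter() function: passing None as the function argument keeps exactly the elements that evaluate to True. The sketch below also shows one possible way to drop whitespace-only strings, which are not empty and therefore truthy; the list messy is made up for illustration:

```python
customers = ['Alice', 'Anne', '', 'Ann', '', 'Tom']

# filter(None, iterable) keeps every element that is truthy.
non_empty = list(filter(None, customers))
print(non_empty)
# ['Alice', 'Anne', 'Ann', 'Tom']

# Hypothetical input with a whitespace-only string, which is truthy.
messy = ['Alice', '   ', '', 'Tom']

# Strip whitespace before testing so that blank strings are dropped too.
non_blank = [x for x in messy if x.strip()]
print(non_blank)
# ['Alice', 'Tom']
```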
Problem: Given a list of strings, how can you filter those that start with another string (or end with another string)? In other words, you want to get all strings that have another string as prefix or suffix.
Example: Say, you've got a list ['Alice', 'Anne', 'Ann', 'Tom'] and you want to obtain all elements that start with 'An'. You expect the filtered list to be ['Anne', 'Ann'].
Solution: Use the list comprehension filtering framework [x for x in list if x.startswith('An')] to keep all elements that start with 'An'. If you want to check strings that end with another string, you can use the str.endswith() method instead.
# Define the list
customers = ['Alice', 'Anne', 'Ann', 'Tom']
# Keep all elements that start with 'An'
filtered = [x for x in customers if x.startswith('An')]
print(filtered)
# ['Anne', 'Ann']
# Keep all elements that end with 'e'
filtered = [x for x in customers if x.endswith('e')]
print(filtered)
# ['Alice', 'Anne']
You use the startswith() and endswith() string methods as filtering conditions.
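Both str.startswith() and str.endswith() also accept a tuple of strings, which is handy when several prefixes or suffixes should pass the filter. A short sketch; the extra name 'Tim' and the chosen prefixes and suffixes are assumptions made for illustration:

```python
# 'Tim' is a hypothetical extra entry for illustration.
customers = ['Alice', 'Anne', 'Ann', 'Tom', 'Tim']

# Keep all elements that start with 'An' or with 'T'.
prefixes = ('An', 'T')
by_prefix = [x for x in customers if x.startswith(prefixes)]
print(by_prefix)
# ['Anne', 'Ann', 'Tom', 'Tim']

# Keep all elements that end with 'e' or with 'm'.
by_suffix = [x for x in customers if x.endswith(('e', 'm'))]
print(by_suffix)
# ['Alice', 'Anne', 'Tom', 'Tim']
```

Note that the argument must be a tuple; passing a list raises a TypeError.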
The filter(function, iterable) function takes a filter function as an argument that takes one list element as input and returns the Boolean value True if the condition is met or False otherwise. This function decides whether an element is included in the filtered list or not.
To define this function, you can use the lambda keyword. The lambda function is an anonymous function. Just think of it as a throw-away function that's only needed as an argument and for nothing else in the code.
Here's the code that shows how to filter a list using a lambda function, returning only the odd values in the list:
# Create the list
lst = [1, 2, 3, 4]
# Get all odd values
print(list(filter(lambda x: x%2, lst)))
# [1, 3]
The lambda function lambda x: x%2 takes one argument x (the element to be checked against the filter) and returns the result of the expression x%2. This modulo expression returns 1 if the integer is odd and 0 if it is even. Thus, all odd elements pass the test.
Problem: Given a list of values lst and a list of Booleans filter_lst, how do you filter the first list using the second list? More precisely, you want to create a new list that includes the i-th element of lst if the i-th element of filter_lst is True.
Example: Here are two example lists:
lst = [1, 2, 3, 4]
filter_lst = [True, False, False, True]
And you want to obtain this list:
[1, 4]
Solution: Use a simple list comprehension statement [lst[i] for i in range(len(lst)) if filter_lst[i]] which checks for each index i whether the corresponding filter Boolean value is True. In this case, you add the element at index i in lst to the new filtered list. Here's the code:
lst = [1, 2, 3, 4]
filter_lst = [True, False, False, True]
res = [lst[i] for i in range(len(lst)) if filter_lst[i]]
print(res)
# [1, 4]
The Boolean list serves as a mask that determines which element passes the filter and which does not.
An alternative is to use the zip() function to iterate over multiple sequences without needing to touch any index:
lst = [1, 2, 3, 4]
filter_lst = [True, False, False, True]
res = [x for (x, boo) in zip(lst, filter_lst) if boo]
print(res)
# [1, 4]
Do you need to brush up your zip() understanding? Check out our in-depth blog article!
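The standard library also offers itertools.compress() for exactly this kind of Boolean mask. A minimal sketch that reuses the two lists from the example above:

```python
from itertools import compress

lst = [1, 2, 3, 4]
filter_lst = [True, False, False, True]

# compress() yields the elements of lst whose mask entry is truthy.
res = list(compress(lst, filter_lst))
print(res)
# [1, 4]
```

Like zip(), compress() stops at the shorter of the two inputs, so a mask that is too short silently drops the remaining elements.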
Problem: Given a list of values and a list of indices, how do you filter all elements with indices in the second list?
Example: You've got the list ['Alice', 'Bob', 'Ann', 'Frank'] and the indices [1, 2]. You're looking for the filtered list ['Bob', 'Ann'].
Solution: Go over all indices in the second list and include the corresponding list elements using a simple list comprehension statement [lst[i] for i in indices].
lst = ['Alice', 'Bob', 'Ann', 'Frank']
indices = [1, 2]
res = [lst[i] for i in indices]
print(res)
# ['Bob', 'Ann']
Only two elements with indices 1 and 2 pass the filter.
Problem: Given a list of dictionaries. Each dictionary consists of one or more (key, value) pairs. You want to filter them by value of a particular dictionary key (attribute). How do you do this?
Minimal Example: Consider the following example where you've got three user dictionaries with username, age, and play_time keys. You want to get a list of all users that meet a certain condition such as play_time>100. Here's what you try to accomplish:
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25, 'play_time': 121}]
superplayers = ...  # Filtering magic goes here
print(superplayers)
The output should look like this where the play_time attribute determines whether a dictionary passes the filter or not, i.e., play_time>100:
[{'username': 'alice', 'age': 23, 'play_time': 101},
 {'username': 'ann', 'age': 25, 'play_time': 121}]
Solution: Use list comprehension [x for x in lst if condition(x)] to create a new list of dictionaries that meet the condition. All dictionaries in lst that don't meet the condition are filtered out. You can define your own condition on list element x.
Here's the code that shows you how to filter out all user dictionaries that don't meet the condition of having played more than 100 hours.
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31, 'play_time': 88},
         {'username': 'ann', 'age': 25, 'play_time': 121}]
superplayers = [user for user in users if user['play_time']>100]
print(superplayers)
The output is the filtered list of dictionaries that meet the condition:
[{'username': 'alice', 'age': 23, 'play_time': 101},
 {'username': 'ann', 'age': 25, 'play_time': 121}]
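Indexing with user['play_time'] raises a KeyError if a dictionary lacks that key. If your data may be incomplete, one possible approach is dict.get() with a default value. In this sketch, the user entry without a play_time key and the default of 0 are assumptions made for illustration:

```python
users = [{'username': 'alice', 'age': 23, 'play_time': 101},
         {'username': 'bob', 'age': 31},  # hypothetical: no play_time recorded
         {'username': 'ann', 'age': 25, 'play_time': 121}]

# Treat a missing play_time as 0 so the comparison never fails.
superplayers = [user for user in users if user.get('play_time', 0) > 100]

names = [user['username'] for user in superplayers]
print(names)
# ['alice', 'ann']
```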
How do you remove all duplicates from a list?
The naive approach is to go over each element and check whether this element already exists in the list. If so, remove it. However, this takes a few lines of code.
A shorter and more concise way is to create a dictionary out of the elements in the list. Each list element becomes a new key in the dictionary. All elements that occur multiple times are assigned to the same key, because a dictionary contains only unique keys.
As dictionary values, you simply take dummy values (None by default).
Then, you simply convert the dictionary back to a list, throwing away the dummy values. As dictionary keys keep their insertion order (guaranteed since Python 3.7), you don't lose the order information of the original list elements.
Here's the code:
>>> lst = [1, 1, 1, 3, 2, 5, 5, 2]
>>> dic = dict.fromkeys(lst)
>>> dic
{1: None, 3: None, 2: None, 5: None}
>>> duplicate_free = list(dic)
>>> duplicate_free
[1, 3, 2, 5]
Filter all elements in a list that fall into a range of values between given start and stop values.
lst = [3, 10, 3, 2, 5, 1, 11]
start, stop = 2, 9
filtered_lst = [x for x in lst if x>=start and x<=stop]
print(filtered_lst)
# [3, 3, 2, 5]
You use the condition x>=start and x<=stop to check if the list element x falls into the range [start, stop] or not.
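Python also supports chained comparisons, so the same range check can be written in a form that reads like the mathematical notation. A minimal sketch with the same data:

```python
lst = [3, 10, 3, 2, 5, 1, 11]
start, stop = 2, 9

# start <= x <= stop is equivalent to x >= start and x <= stop.
filtered_lst = [x for x in lst if start <= x <= stop]
print(filtered_lst)
# [3, 3, 2, 5]
```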
Filter all elements in a list that are greater than a given value y.
lst = [3, 10, 3, 2, 5, 1, 11]
y = 2
filtered_lst = [x for x in lst if x>y]
print(filtered_lst)
# [3, 10, 3, 5, 11]
You use the condition x>y to check if the list element x is greater than y or not. In the former case, it's included in the filtered list. In the latter case, it's not.
You can use the same idea with the less than operator < via the list comprehension statement [x for x in lst if x<y].
How can you count elements under a certain condition in Python? For example, what if you want to count all even values in a list? Or all prime numbers? Or all strings that start with a certain character? There are multiple ways to accomplish this; let's discuss them one by one.
Say, you have a condition for each element x. Let's make it a function with the name condition(x). You can define any condition you want, just put it in your function. For example, this condition returns True for all elements that are greater than the integer 10:
def condition(x):
    return x > 10
print(condition(10))
# False
print(condition(2))
# False
print(condition(11))
# True
But you can also define more complicated conditions such as checking if they are prime numbers.
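As an illustration of such a more complicated condition, here is one possible prime check based on trial division. The function name is_prime is introduced here and this is only a sketch, not an optimized implementation; it relies on math.isqrt(), which requires Python 3.8 or newer:

```python
import math


def is_prime(x):
    """Return True if x is a prime number (simple trial division)."""
    if x < 2:
        return False
    # A composite number has a divisor no larger than its square root.
    for divisor in range(2, math.isqrt(x) + 1):
        if x % divisor == 0:
            return False
    return True


lst = [10, 11, 42, 1, 2, 3]
primes = [x for x in lst if is_prime(x)]
print(primes)
# [11, 2, 3]
```

Any function with this shape (one element in, a Boolean out) can serve as the condition.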
How can you count the elements of the list IF the condition is met?
The answer is to use a simple generator expression sum(condition(x) for x in lst):
>>> def condition(x):
...     return x>10
...
>>> lst = [10, 11, 42, 1, 2, 3]
>>> sum(condition(x) for x in lst)
2
The result indicates that there are two elements that are larger than 10. You used a generator expression that returns an iterator of Booleans. Note that the Boolean True is represented by the integer value 1 and the Boolean False is represented by the integer value 0. That's why you can simply calculate the sum over all Booleans to obtain the number of elements for which the condition holds.
If you want to determine the number of elements that are greater than or smaller than a specified value, just modify the condition in this example:
>>> def condition(x):
...     return x>10
...
>>> lst = [10, 11, 42, 1, 2, 3]
>>> sum(condition(x) for x in lst)
2
For example, to find the number of elements smaller than 5, use the condition x<5 in the generator expression:
>>> lst = [10, 11, 42, 1, 2, 3]
>>> sum(x<5 for x in lst)
3
To count the number of zeros in a given list, use the list.count(0) method call.
To count the number of non-zeros in a given list, you should use conditional counting as discussed before:
def condition(x):
    return x!=0
lst = [10, 11, 42, 1, 2, 0, 0, 0]
print(sum(condition(x) for x in lst))
# 5
An alternative is to use a combination of the map() function and a lambda function.
Here's the code:
>>> sum(map(lambda x: x%2==0, [1, 2, 3, 4, 5]))
2
You count the number of even integers in the list.
The result is the number of elements for which the condition evaluates to True.
Given a list of strings, how do you obtain all elements that have more than x characters? In other words: how do you filter a list by string length?
coders = ['Ann', 'Alice', 'Frank', 'Pit']
filtered = [x for x in coders if len(x)>3]
print(filtered)
# ['Alice', 'Frank']
The list comprehension statement [x for x in coders if len(x)>3] filters all strings that have more than three characters.
How do you remove all None values from a list? For example, you have the list ['Alice', None, 'Ann', None, None, 'Bob'] and you want the list ['Alice', 'Ann', 'Bob']. How do you do this?
coders = ['Alice', None, 'Ann', None, None, 'Bob']
filtered = [x for x in coders if x]
print(filtered)
# ['Alice', 'Ann', 'Bob']
In Python, each object has an associated Boolean value, so you can use any Python object as a condition. The value None is associated with the Boolean value False. Note that this condition also removes other falsy values such as 0 or the empty string.
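A truthiness test such as [x for x in coders if x] drops every falsy element, not only None. If values like 0 or the empty string must survive, an explicit identity check against None is a safer condition. A short sketch; the list values is made up for illustration:

```python
# Hypothetical data that mixes None with other falsy values.
values = ['Alice', None, 0, '', None, 'Bob']

# Truthiness test: removes None, but also 0 and ''.
truthy_only = [x for x in values if x]
print(truthy_only)
# ['Alice', 'Bob']

# Identity test: removes only the None entries.
not_none = [x for x in values if x is not None]
print(not_none)
# ['Alice', 0, '', 'Bob']
```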
Problem: Say, you've got a JSON list object. You want to filter the list based on an attribute. How do you accomplish that?
Example: Given the following JSON list.
json = [
    {
        "user": "alice",
        "type": "free"
    },
    {
        "user": "ann",
        "type": "paid"
    },
    {
        "user": "bob",
        "type": "paid"
    }
]
You want to find all users that have a 'paid' account type:
[
    {
        "user": "ann",
        "type": "paid"
    },
    {
        "user": "bob",
        "type": "paid"
    }
]
Solution: Use list comprehension [x for x in json if x['type']=='paid'] to filter the list and obtain a new json list with the objects that pass the filter.
json = [
    {
        "user": "alice",
        "type": "free"
    },
    {
        "user": "ann",
        "type": "paid"
    },
    {
        "user": "bob",
        "type": "paid"
    }
]
filtered = [x for x in json if x['type']=='paid']
print(filtered)
# [{'user': 'ann', 'type': 'paid'},
#  {'user': 'bob', 'type': 'paid'}]
Only Ann and Bob have a paid account and pass the test x['type']=='paid'.
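If the JSON data arrives as text, for example from a file or a web API, it first has to be parsed into Python objects. A minimal sketch using the standard library json module; the variable names raw, accounts, and paid are chosen here for illustration (and to avoid shadowing the module name json):

```python
import json

# JSON text as it might arrive from a file or an HTTP response.
raw = '''[
    {"user": "alice", "type": "free"},
    {"user": "ann", "type": "paid"},
    {"user": "bob", "type": "paid"}
]'''

# Parse the text into a list of dictionaries.
accounts = json.loads(raw)

# Filter by attribute; get() avoids a KeyError if "type" is missing.
paid = [x for x in accounts if x.get("type") == "paid"]

# Serialize the filtered list back to JSON text.
print(json.dumps(paid))
# [{"user": "ann", "type": "paid"}, {"user": "bob", "type": "paid"}]
```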
Want to filter your list by a given condition in one line of code? Use the list comprehension statement [x for x in list if condition] where the condition part can be any Boolean expression on x. This one-liner returns a new list object with all elements that pass the filtering test.
Here's an example:
lst = ['Alice', 3, 5, 'Bob', 10]
# ONE-LINER:
f = [x for x in lst if type(x)==str]
print(f)
# ['Alice', 'Bob']
The one-liner filters all elements in the list and checks whether they are of type string. If they are, they pass the test and are included in the new list.
If you like one-liners, you'll love my Python One-Liner book (NoStarch Press 2020). It shows you exactly how to write Pythonic code and compress your thinking and coding to the most minimalistic form.
[Spoiler] Which is faster to filter a list: filter() vs list comprehension? For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
To answer this question, I've written a short script that tests the runtime performance of filtering large lists of increasing sizes using the filter() and the list comprehension methods.
My thesis is that the list comprehension method should be slightly faster for larger list sizes because it leverages the efficient CPython implementation of list comprehension and doesn't need to call an extra function.
I used my notebook with an Intel(R) Core(TM) i7-8565U 1.8GHz processor (with Turbo Boost up to 4.6 GHz) and 8 GB of RAM.
Then, I created 100 lists with both methods with sizes ranging from 10,000 elements to 1,000,000 elements. As elements, I simply incremented integer numbers by one starting from 0.
Here's the code I used to measure and plot the results: which method is faster, filter() or list comprehension?
import time
# Compare runtime of both methods
list_sizes = [i * 10000 for i in range(100)]
filter_runtimes = []
list_comp_runtimes = []
for size in list_sizes:
    lst = list(range(size))
    # Get time stamps
    time_0 = time.time()
    list(filter(lambda x: x%2, lst))
    time_1 = time.time()
    [x for x in lst if x%2]
    time_2 = time.time()
    # Calculate runtimes
    filter_runtimes.append((size, time_1 - time_0))
    list_comp_runtimes.append((size, time_2 - time_1))
# Plot everything
import matplotlib.pyplot as plt
import numpy as np
f_r = np.array(filter_runtimes)
l_r = np.array(list_comp_runtimes)
print(filter_runtimes)
print(list_comp_runtimes)
plt.plot(f_r[:,0], f_r[:,1], label="filter()")
plt.plot(l_r[:,0], l_r[:,1], label="list comprehension")
plt.xlabel('list size')
plt.ylabel('runtime (seconds)')
plt.legend()
plt.savefig('filter_list_comp.jpg')
plt.show()
The code compares the runtimes of the filter() function and the list comprehension variant to filter a list. Note that the filter() function returns a filter object, so you need to convert it to a list using the list() constructor.
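For small lists, a single time.time() measurement can be too coarse to show a difference. One possible alternative, sketched here and not the script used for the plot, is the standard library timeit module, which repeats each call many times. The helper names with_filter and with_comprehension and the list size of 10,000 are assumptions for illustration, and the printed numbers depend on your machine:

```python
import timeit

lst = list(range(10_000))


def with_filter():
    # filter() is lazy, so list() forces the full computation.
    return list(filter(lambda x: x % 2, lst))


def with_comprehension():
    return [x for x in lst if x % 2]


# Run each variant 20 times per measurement and keep the best of 3.
t_filter = min(timeit.repeat(with_filter, number=20, repeat=3))
t_comp = min(timeit.repeat(with_comprehension, number=20, repeat=3))

print(f"filter():           {t_filter:.5f} s per 20 runs")
print(f"list comprehension: {t_comp:.5f} s per 20 runs")
```

Taking the minimum over several repetitions reduces the influence of other processes running on the same machine.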
Here's the resulting plot that compares the runtime of the two methods. On the x axis, you can see the list size from 0 to 1,000,000 elements. On the y axis, you can see the runtime in seconds needed to execute the respective functions.
The resulting plot shows that both methods are extremely fast for a few tens of thousands of elements. In fact, they are so fast that the time() function of the time module cannot capture the elapsed time.
But as you increase the size of the lists to hundreds of thousands of elements, the list comprehension method starts to win:
For large lists with one million elements, filtering lists with list comprehension is 40% faster than the built-in filter() method.
The reason is the efficient implementation of the list comprehension statement. An interesting observation is the following, though. If you don't convert the filter object to a list, you get the following result:
Suddenly the filter() function has a constant runtime of close to 0 seconds, no matter how many elements are in the list. Why is this happening?
The explanation is simple: the filter function returns an iterator, not a list. The iterator doesn't need to compute a single element until it is requested to compute the next() element. So, the filter() function computes the next element only if it is required to do so. Only if you convert it to a list must it compute all values. Otherwise, it doesn't actually compute a single value beforehand.
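This lazy behavior can be made visible by recording which elements the filter function actually inspects. In the following sketch, the list calls and the function is_odd are introduced only for illustration:

```python
calls = []


def is_odd(x):
    calls.append(x)  # record every element the filter inspects
    return x % 2 == 1


# Creating the filter object does not call is_odd() at all.
lazy = filter(is_odd, range(1_000_000))
calls_before = len(calls)
print(calls_before)  # 0

# Requesting one element inspects only as many inputs as needed.
first = next(lazy)
print(first)  # 1
print(calls)  # [0, 1]
```

Only the inputs 0 and 1 were examined to produce the first odd number; the remaining elements stay untouched until they are requested.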
This tutorial has shown you the ins and outs of the filter() function in Python and compared it against the list comprehension way of filtering: [x for x in list if condition]. You've seen that the latter is not only more readable and more Pythonic, but also faster. So take the list comprehension approach to filter lists!