How to iterate over rows in a Pandas DataFrame

If the dataset is small, dataframe.to_string(), dataframe.head(), or dataframe.tail() are usually enough to inspect the data. To process a DataFrame row by row, Pandas offers the following options:
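
As a quick illustration of the inspection methods mentioned above, here is a minimal sketch. The frame contents and variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the values and column names are assumptions for this sketch.
df = pd.DataFrame(np.arange(30).reshape(10, 3), columns=["col1", "col2", "col3"])

first_rows = df.head(3)    # first 3 rows (head() defaults to 5)
last_rows = df.tail(2)     # last 2 rows (tail() defaults to 5)
as_text = df.to_string()   # the whole frame rendered as a single string

print(first_rows)
print(last_rows)
print(as_text)
```

Note that to_string() renders every row, so it is only practical for small frames.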

iterrows()

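iterrows() iterates over the DataFrame and yields each row as an (index, Series) pair: the row's index label and the row's data as a Series. Because every row is packed into a single Series, dtypes are not preserved across columns. You should also never modify something you are iterating over. Depending on the dtypes, the iterator returns a copy rather than a view, so writing to the row may have no effect on the DataFrame.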

import numpy as np
import pandas as pd

rows = ["a1", "b2", "c3"]
columns = ["col1", "col2","col3"]
values = np.random.randn(3,3)*15

#pandas dataframe
df = pd.DataFrame(values, index=rows, columns=columns)

# using iterrows() to iterate over a dataframe
for index, row in df.iterrows():
    # index is the row label; row is a Series holding that row's values
    print(row)
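
The two caveats of iterrows() can be seen in a small sketch: dtypes are upcast when a row is packed into a Series, and writing to the row does not change the DataFrame. The column values below are made up for illustration, and the copy behavior shown assumes a frame with mixed dtypes, as in this example.

```python
import pandas as pd

# Illustrative frame with one integer and one float column (values are assumptions).
df = pd.DataFrame({"col1": [1, 2], "col2": [1.5, 2.5]}, index=["a1", "b2"])

seen_dtypes = []
for index, row in df.iterrows():
    # Each row is a single Series, so the integer column is upcast to float64.
    seen_dtypes.append(str(row.dtype))
    # This writes to the temporary row Series only, not to df.
    row["col1"] = 0

print(seen_dtypes)
print(df["col1"].tolist())  # the original values are still there

# A safer pattern: collect results during the loop and assign them afterwards.
doubled = []
for index, row in df.iterrows():
    doubled.append(row["col2"] * 2)
df["col2_doubled"] = doubled
print(df)
```

Collecting results in a list and assigning once after the loop avoids mutating the frame while it is being iterated.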

itertuples()

itertuples() returns an iterator that yields one namedtuple per row, with the row's index as the first field (named Index) followed by the column values. It is generally faster than iterrows() because it does not build a Series for every row, and it preserves each column's dtype. Column names that are invalid Python identifiers, are repeated, or begin with an underscore are renamed to positional names.

import numpy as np
import pandas as pd

rows = ["a1", "b2", "c3"]
columns = ["col1", "col2","col3"]
values = np.random.randn(3,3)*15

#pandas dataframe
df = pd.DataFrame(values, index=rows, columns=columns)

# using itertuples() to iterate over a dataframe
for row in df.itertuples():
    # row is a namedtuple: row.Index is the row label, row.col1 etc. are the values
    print(row)
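
The renaming rule for itertuples() is easier to see with a concrete case. The sketch below uses made-up column names, one of which contains a space and one of which starts with an underscore, and then shows the index=False and name=None options for getting plain tuples.

```python
import pandas as pd

# Illustrative frame: "my col" is not a valid identifier, "_hidden" starts with an underscore.
df = pd.DataFrame(
    {"col1": [1, 2], "my col": [3, 4], "_hidden": [5, 6]},
    index=["a1", "b2"],
)

first = next(df.itertuples())
# Valid names are kept; the other two become positional names.
print(first._fields)

for row in df.itertuples():
    # Renamed fields can still be reached by position.
    print(row.Index, row.col1, row[2], row[3])

# index=False drops the index field; name=None yields plain tuples instead of namedtuples.
plain = list(df.itertuples(index=False, name=None))
print(plain)
```

Plain tuples avoid the renaming issue entirely, at the cost of losing attribute access by column name.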

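itertuples() is usually much faster than iterrows(). Speedups on the order of 100 times are commonly reported, but the exact figure depends on the DataFrame's size and dtypes and on the pandas version.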
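
Since the speed difference varies with the data and the environment, it is worth measuring on your own machine. The following is a rough sketch using the standard timeit module; the frame size, the helper function names, and the repeat count are arbitrary choices for this example, and no particular timing result is guaranteed.

```python
import timeit

import numpy as np
import pandas as pd

# Illustrative frame; 5000 rows is an arbitrary size chosen to keep the run short.
df = pd.DataFrame(np.random.randn(5000, 3), columns=["col1", "col2", "col3"])


def sum_with_iterrows(frame):
    """Sum col1 by looping with iterrows() (hypothetical helper)."""
    total = 0.0
    for _, row in frame.iterrows():
        total += row["col1"]
    return total


def sum_with_itertuples(frame):
    """Sum col1 by looping with itertuples() (hypothetical helper)."""
    total = 0.0
    for row in frame.itertuples(index=False):
        total += row.col1
    return total


# Time three full passes over the frame with each method.
t_rows = timeit.timeit(lambda: sum_with_iterrows(df), number=3)
t_tuples = timeit.timeit(lambda: sum_with_itertuples(df), number=3)

print(f"iterrows:   {t_rows:.4f} s")
print(f"itertuples: {t_tuples:.4f} s")
```

For a simple aggregate like this one, a vectorized call such as df["col1"].sum() would normally be preferred over either loop.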