If the dataset is small, dataframe.to_string(), dataframe.head(), or dataframe.tail() can be used to inspect the data. The following are some options for iterating over a DataFrame in pandas:
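As a quick illustration of the three inspection methods mentioned above, here is a minimal sketch. The DataFrame and its contents are hypothetical and exist only for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical small DataFrame, used only for illustration
df = pd.DataFrame(
    np.arange(30).reshape(10, 3),
    columns=["col1", "col2", "col3"],
)

first_rows = df.head(3)      # first 3 rows (the default is 5)
last_rows = df.tail(3)       # last 3 rows (the default is 5)
full_text = df.to_string()   # the whole frame rendered as one string

print(first_rows)
print(last_rows)
print(full_text)
```

Note that to_string() renders every row, so it is only practical for small frames.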
iterrows()
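iterrows() iterates over the rows of the DataFrame and yields each row as an (index, Series) pair: the row's index label and a Series holding the row's values. You should, however, never modify something you are iterating over. Doing so is not guaranteed to work in every circumstance: depending on the data types, the iterator may return a copy rather than a view, in which case writing to it has no effect on the DataFrame.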
import numpy as np
import pandas as pd
rows = ["a1", "b2", "c3"]
columns = ["col1", "col2", "col3"]
# 3x3 array of random floats, scaled by 15
values = np.random.randn(3, 3) * 15
# Build the pandas DataFrame
df = pd.DataFrame(values, index=rows, columns=columns)
# Use iterrows() to iterate over the DataFrame; each item is an (index, Series) pair
for index, row in df.iterrows():
    print(row)
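The warning about modifying rows during iteration can be made concrete with a small sketch. The frame below is hypothetical and deliberately mixes dtypes, a case in which the row Series is built from a copy of the data. Collecting results and assigning them after the loop is one possible safer pattern, not the only one.

```python
import pandas as pd

# Hypothetical mixed-dtype frame, for illustration only
df = pd.DataFrame(
    {"name": ["x", "y", "z"], "score": [1, 2, 3]},
    index=["a1", "b2", "c3"],
)

# Writing to the row Series: with mixed dtypes the row is a copy,
# so the DataFrame itself is left unchanged.
for index, row in df.iterrows():
    row["score"] = row["score"] * 10

unchanged = df["score"].tolist()

# Safer pattern: collect the new values, then assign after the loop.
scaled = [row["score"] * 10 for _, row in df.iterrows()]
df["score_x10"] = scaled
print(df)
```

With a frame of a single dtype the behavior can differ, which is why writing to the rows should not be relied on either way.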
itertuples()
itertuples() returns an iterator that yields one namedtuple per row of the DataFrame. By default, the first field of each tuple is the row's index label and the remaining fields are the column values. When iterating over the rows of a DataFrame, this method is generally faster than iterrows(). Column names that are invalid Python identifiers, are repeated, or begin with an underscore are replaced with positional names in the namedtuple.
import numpy as np
import pandas as pd
rows = ["a1", "b2", "c3"]
columns = ["col1", "col2", "col3"]
# 3x3 array of random floats, scaled by 15
values = np.random.randn(3, 3) * 15
# Build the pandas DataFrame
df = pd.DataFrame(values, index=rows, columns=columns)
# Use itertuples() to iterate over the DataFrame; each row is a namedtuple
for row in df.itertuples():
    print(row)
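The renaming rule for column names is easier to see in a small example. The sketch below uses a hypothetical frame with one valid column name, one name containing a space, and one name starting with an underscore. The index and name parameters shown at the end are part of itertuples() itself.

```python
import pandas as pd

# Hypothetical frame: one valid name, one invalid identifier,
# and one name that starts with an underscore
df = pd.DataFrame(
    {"col1": [1, 2], "my col": [3, 4], "_hidden": [5, 6]},
    index=["a1", "b2"],
)

for row in df.itertuples():
    # row.Index holds the index label; col1 is reachable by name,
    # while the other two columns fall back to positional names
    print(row.Index, row.col1, row._2, row._3)

first = next(df.itertuples())
fields = first._fields

# index=False drops the index field; name=None yields plain tuples
plain = list(df.itertuples(index=False, name=None))
print(fields)
print(plain)
```

The positional names follow the field's position in the tuple, counting the index as position 0.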
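itertuples() can be on the order of 100 times faster than iterrows(), although the exact speedup depends on the size of the DataFrame and the data types of its columns.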
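The speed difference can be checked on your own machine with a rough timing sketch like the one below. The frame size, the repeat count, and the two helper functions are arbitrary choices made for illustration, and the measured ratio will vary between systems and pandas versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame; the size is chosen arbitrarily
df = pd.DataFrame(np.random.randn(2000, 3), columns=["col1", "col2", "col3"])

def sum_with_iterrows(frame):
    """Sum col1 by looping over (index, Series) pairs."""
    total = 0.0
    for _, row in frame.iterrows():
        total += row["col1"]
    return total

def sum_with_itertuples(frame):
    """Sum col1 by looping over namedtuples."""
    total = 0.0
    for row in frame.itertuples(index=False):
        total += row.col1
    return total

t_rows = timeit.timeit(lambda: sum_with_iterrows(df), number=3)
t_tuples = timeit.timeit(lambda: sum_with_itertuples(df), number=3)
print(f"iterrows:   {t_rows:.4f} s")
print(f"itertuples: {t_tuples:.4f} s")
```

Both helpers compute the same sum, so any difference in the timings comes from the iteration method alone.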