A detailed guide to the pandas.DataFrame.loc function in Python

Date: 2021-05-22

Official documentation

DataFrame.loc
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
# Selection is normally by label, but a boolean array can be used as well.

Allowed inputs are: # a single label, a list of labels, or a slice of labels

  • A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index). # Here 5 is a label value, not a numeric position.
  • A list or array of labels, e.g. ['a', 'b', 'c'].
  • A slice object with labels, e.g. 'a':'f'.
    Warning: Note that contrary to usual Python slices, both the start and the stop are included. # With a label slice, both endpoints are part of the result.
  • A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
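The point about 5 being a label rather than a position can be illustrated with a small sketch. The frame and the names `frame`, `by_label`, and `by_position` below are hypothetical and chosen only for this illustration; the integer index deliberately does not start at 0, and the position-based `.iloc` is used for contrast.

```python
import pandas as pd

# Hypothetical data: the integer index labels do not match the row positions.
frame = pd.DataFrame({"value": [10, 20, 30]}, index=[5, 6, 7])

# .loc treats 5 as an index label, so this returns the first row.
by_label = frame.loc[5]

# .iloc treats its argument as a position; position 0 is that same row.
by_position = frame.iloc[0]

print(by_label["value"], by_position["value"])

# Position 5 does not exist in a 3-row frame, so .iloc[5] fails
# even though the label 5 is present.
try:
    frame.iloc[5]
except IndexError as exc:
    print("iloc[5] failed:", exc)
```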

Worked examples

I. Selecting values

1. Create the DataFrame

import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df
Out[15]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8

2. Single label. A single row label returns a Series.

df.loc['viper']
Out[17]:
max_speed    4
shield       5
Name: viper, dtype: int64

2. List of labels. A list of row labels returns a DataFrame.

df.loc[['cobra','viper']]
Out[20]:
       max_speed  shield
cobra          1       2
viper          4       5

3. Single label for row and column. Selecting a row and a column together returns a single value.

df.loc['cobra', 'shield']
Out[24]: 2

4. Slice with labels for row and single label for column. This selects several rows and a single column at once. As mentioned above, both the start and the stop of a label slice are included.

df.loc['cobra':'viper', 'max_speed']
Out[25]:
cobra    1
viper    4
Name: max_speed, dtype: int64

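5. Boolean list with the same length as the row axis.
A boolean list selects rows by position: a row is selected when the entry at its position is True. The list should have one entry per row. The shorter lists in Out[31] and Out[32] were accepted by the pandas version used for this session, but recent pandas versions raise an IndexError when the lengths differ.

df
Out[30]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8

df.loc[[True]]
Out[31]:
       max_speed  shield
cobra          1       2

df.loc[[True,False]]
Out[32]:
       max_speed  shield
cobra          1       2

df.loc[[True,False,True]]
Out[33]:
            max_speed  shield
cobra               1       2
sidewinder          7       8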
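Because the documentation asks for a boolean list as long as the row axis, one way to keep short masks portable is to pad them explicitly. The following is a minimal sketch under that assumption; the names `short_mask`, `padded_mask`, and `selected` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# A mask that names only the first row, as in df.loc[[True]] above.
short_mask = [True]

# Pad with False so there is exactly one entry per row.
padded_mask = short_mask + [False] * (len(df) - len(short_mask))

selected = df.loc[padded_mask]
print(selected)
```

Padding with False gives the same row selection as Out[31] while satisfying the same-length requirement.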

6. Conditional that returns a boolean Series. Rows are selected where the condition is True.

df.loc[df['shield'] > 6]
Out[34]:
            max_speed  shield
sidewinder          7       8

7. Conditional that returns a boolean Series with column labels specified. The condition picks the rows and the column list picks the columns.

df.loc[df['shield'] > 6, ['max_speed']]
Out[35]:
            max_speed
sidewinder          7
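It may help to look at the intermediate boolean Series that the condition produces before it reaches `.loc`. The sketch below rebuilds the example frame, inspects that mask, and then combines two conditions with `&`; the combined condition and the variable names are illustrative additions, not part of the original session.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# The comparison itself yields a boolean Series aligned on df's index.
mask = df['shield'] > 6
print(mask)

# The mask alone filters rows; adding a column list also narrows the columns.
rows = df.loc[mask]
rows_one_col = df.loc[mask, ['max_speed']]

# Conditions combine element-wise with & (and) or | (or);
# each comparison needs its own parentheses.
both = df.loc[(df['shield'] > 2) & (df['max_speed'] < 7)]
print(both)
```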

8. Callable that returns a boolean Series. The function receives the DataFrame and its boolean result selects the rows.

df
Out[37]:
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8

df.loc[lambda df: df['shield'] == 8]
Out[38]:
            max_speed  shield
sidewinder          7       8
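A callable is mostly useful when the frame being filtered has no name yet, for example in the middle of a method chain. The following sketch assumes a hypothetical derived column `total` that exists only on the intermediate result; it is one possible use, not something taken from the original session.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# 'total' is a hypothetical column created by assign(); it exists only on
# the intermediate frame, so the callable passed to .loc is how we reach it.
result = (
    df.assign(total=lambda d: d['max_speed'] + d['shield'])
      .loc[lambda d: d['total'] > 5]
)
print(result)
```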

II. Assigning values

1. Set value for all items matching the list of labels. The value is assigned to the rows picked by a list of labels, in the given column.

df.loc[['viper', 'sidewinder'], ['shield']] = 50
df
Out[43]:
            max_speed  shield
cobra               1       2
viper               4      50
sidewinder          7      50

2. Set value for an entire row. Every entry in the row receives the value.

df.loc['cobra'] = 10
df
Out[48]:
            max_speed  shield
cobra              10      10
viper               4      50
sidewinder          7      50

3. Set value for an entire column. Every entry in the column receives the value.

df.loc[:, 'max_speed'] = 30
df
Out[50]:
            max_speed  shield
cobra              30      10
viper              30      50
sidewinder         30      50

4. Set value for rows matching callable condition. The value is assigned to every row selected by the condition.

df.loc[df['shield'] > 35] = 0
df
Out[52]:
            max_speed  shield
cobra              30      10
viper               0       0
sidewinder          0       0
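The four assignments above build on one another, so the following sketch replays them on a fresh copy of the example frame to show the cumulative effect. The last statement is an illustrative addition (not in the original session) showing how naming a column restricts a conditional assignment to that column.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

df.loc[['viper', 'sidewinder'], ['shield']] = 50  # two rows, one column
df.loc['cobra'] = 10                              # a whole row
df.loc[:, 'max_speed'] = 30                       # a whole column
df.loc[df['shield'] > 35] = 0                     # all columns of matching rows
after_original_steps = df.copy()
print(after_original_steps)

# Illustrative extra step: name the column so only it changes in matching rows.
df.loc[df['shield'] == 0, 'shield'] = -1
print(df)
```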

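III. Integer row index

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])
df
Out[54]:
   max_speed  shield
7          1       2
8          4       5
9          7       8

Selecting several rows with a slice of row labels (the integers are labels, and the stop label 9 is included):

df.loc[7:9]
Out[55]:
   max_speed  shield
7          1       2
8          4       5
9          7       8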
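With integer labels it is easy to mistake a label slice for a positional one. As a brief sketch, the same frame is sliced below with `.loc` and with the position-based `.iloc`; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])

by_label = df.loc[7:9]       # labels 7, 8 and 9: the stop label is included
by_position = df.iloc[0:2]   # positions 0 and 1: the stop position is excluded
out_of_range = df.iloc[7:9]  # no rows at positions 7 and 8, so this is empty

print(len(by_label), len(by_position), len(out_of_range))
```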

IV. MultiIndex

1. Create a DataFrame with a MultiIndex

tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)
df
Out[57]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36

2. Single label. Passing a label from the outermost index level returns a DataFrame.

df.loc['cobra']
Out[58]:
         max_speed  shield
mark i          12       2
mark ii          0       4

3. Single index tuple. Passing a complete index tuple returns a Series.

df.loc[('cobra', 'mark ii')]
Out[59]:
max_speed    0
shield       4
Name: (cobra, mark ii), dtype: int64

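4. Single label for row and column. This behaves like passing a tuple and returns a Series. Here 'mark i' is resolved against the second index level rather than the columns, as the Name in the output shows.

df.loc['cobra', 'mark i']
Out[60]:
max_speed    12
shield        2
Name: (cobra, mark i), dtype: int64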

5. Single tuple. Note using [[ ]] returns a DataFrame. Passing a list that contains the tuple returns a DataFrame.

df.loc[[('cobra', 'mark ii')]]
Out[61]:
               max_speed  shield
cobra mark ii          0       4
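The return type depends on how much of the index key is supplied and whether it is wrapped in a list. The sketch below rebuilds the MultiIndex frame and prints the type of each result; the variable names `outer`, `full_key`, and `wrapped` are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
])
df = pd.DataFrame([[12, 2], [0, 4], [10, 20], [1, 4], [7, 1], [16, 36]],
                  columns=['max_speed', 'shield'], index=index)

outer = df.loc['cobra']                   # outer label: DataFrame indexed by the inner level
full_key = df.loc[('cobra', 'mark ii')]   # complete tuple: Series
wrapped = df.loc[[('cobra', 'mark ii')]]  # list of tuples: DataFrame keeping both levels

for name, obj in [('outer', outer), ('full_key', full_key), ('wrapped', wrapped)]:
    print(name, type(obj).__name__)
```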

6. Single tuple for the index with a single label for the column. To get the value at one row of one column, pass the MultiIndex tuple first and then the column label.

df.loc[('cobra', 'mark i'), 'shield']
Out[62]: 2

7. Slice from an index tuple to a single label, or to another index tuple:

df.loc[('cobra', 'mark i'):'viper']
Out[63]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36

df.loc[('cobra', 'mark i'):'sidewinder']
Out[64]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4

df.loc[('cobra', 'mark i'):('sidewinder','mark i')]
Out[65]:
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
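The three slices differ only in where they stop, which is easiest to see by comparing row counts. The sketch below rebuilds the frame and does that; it also checks that the index is sorted, on the assumption (stated here rather than taken from the original text) that label slicing on a MultiIndex generally expects a sorted index.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
])
df = pd.DataFrame([[12, 2], [0, 4], [10, 20], [1, 4], [7, 1], [16, 36]],
                  columns=['max_speed', 'shield'], index=index)

# Assumption: slicing relies on a sorted index; this one already is sorted.
print(df.index.is_monotonic_increasing)

# A bare outer label as the stop includes every row under that label.
to_viper = df.loc[('cobra', 'mark i'):'viper']
to_sidewinder = df.loc[('cobra', 'mark i'):'sidewinder']

# A full tuple as the stop ends exactly at that row (inclusive).
to_tuple = df.loc[('cobra', 'mark i'):('sidewinder', 'mark i')]

print(len(to_viper), len(to_sidewinder), len(to_tuple))
```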
