Adjacent records grouping with the original order
Task:List the team information with the most consecutive NBA titles.
Python
| 1 | import pandas as pd |
| 2 | import numpy as np |
| 3 | pd.set_option('display.max_columns', None) |
| 4 | nba_file = 'E:\\txt\\nba.txt' |
| 5 | nba_champion = pd.read_csv(nba_file,sep='\t') |
| 6 | nba_champion = nba_champion.sort_values(by = 'Year') |
| 7 | arr = np.zeros(len(nba_champion)) |
| 8 | arr[nba_champion['Champion']!=nba_champion['Champion'].shift(1)]=1 |
| 9 | arr = np.cumsum(arr) |
| 10 | nba_champion['flag']=arr |
| 11 | nba_champion_g = nba_champion.groupby(by='flag') |
| 12 | max_num = nba_champion_g.size().idxmax() |
| 13 | max_champion = nba_champion_g.get_group(max_num) |
| 14 | print(max_champion) |
Python doesn't have the ability to group by adjacent conditions. You need to create a list of grouping flags.
esProc
| A | ||
| 1 | =connect("mysql") | Connect to database |
| 2 | =A1.query("select * from nba order by year") | Sort by year |
| 3 | =A2.group@o(Champion) | When adjacency is different, start a new group |
| 4 | =A3.maxp(~.len()) | List the team information with the most consecutive NBA titles. |
The set of esProc is ordered, and it is very convenient to merge the adjacent same records into one group and start a new group when the adjacent record is different.