zmc
2023-10-12 ed135d79df12a2466b52dae1a82326941211dcc9
U
­ý°d0‰ã@sødZddlmZddlZddlZddlmZmZm    Z    m
Z
m Z m Z m Z mZmZddlmZdd„Zdd„ZgZgZd    D]˜Zed
d ƒD]ˆ\ZZeeeeƒZde ded ed  ¡ƒdd¡fZddddgfZeeeƒD]6\ZZ e !eee eef¡e !e›de›de›¡qÒq‚qtej"j#ej"j$deedej" $dddg¡ej" $dddg¡ej" $dddg¡ej" $dddg¡ej" $dddg¡dd„ƒƒƒƒƒƒƒZ%ej" $d ddg¡d!d"„ƒZ&ej" $d#d$d%gd$d%d&gg¡d'd(„ƒZ'ej" $d#d$d%gd$d%d&gg¡d)d*„ƒZ(d+d,„Z)d-d.„Z*ej+d/d0„ƒZ,d1d2„Z-d3d4„Z.d5d6„Z/d7d8„Z0ej" $d9d:d;d<g¡ej" $dddg¡ej" $d=d>d?d    g¡ej" $d@ddg¡ej" $dAddg¡dBdC„ƒƒƒƒƒZ1ej" $dDddg¡ej" $dEddddddFdGgdddddgddFddFdgfdddGdFdddgdddddgddFdFddgfdddGddFddgdddddgddFdFddgfg¡dHdI„ƒƒZ2ej+dJdK„ƒZ3ej" $dLddddMdddgdNdOdPgfddddMdddgdNdQdPgfddddMdddgdNdRdSgfddddTdUdVdVgdNdRdSgfg¡dWdX„ƒZ4ej+dYdZ„ƒZ5ej" $d[dddddFd d\d]d^ddGg    dUdUd_dVdVdVdVd_d_g    fdddddFd ddGgdUdUd_d_d_d_gfddddd d\d]d^gdUdUdVdVdVdVgfddddd gdUdUd_gfg¡d`da„ƒZ6ej+dbdc„ƒZ7ej" $dddddge j8dedfdggdhdidjgdkfdddddge e dgƒe dldmdngƒe dodpej9gƒgddddgddddgddddggdhdidjgdqfg¡ej" $dddg¡drds„ƒƒZ:ej" $d@ddg¡ej" $dtddg¡ej" $duddMej;ddddddddddddg ej<dvfddTe ;dUdVdVdwdwdwdUdUdwdwdwdwg ¡fg¡dxdy„ƒƒƒZ=dzd{„Z>ej" $d@ddg¡ej" $duddMej;ddddddddddddg ej<dvfddTe ;dUdVdVdwdwdwdUdUdwdwdwdwg ¡fg¡d|d}„ƒƒZ?ej" $d@ddg¡ej" $duddMej;ddddddddddddddddddgej<dvfddTe ;dUdVdVdwdwdwdUdUdwdwdwdwdwdwdwdwdwdwg¡fg¡d~d„ƒƒZ@ej" $d@ddg¡ej" $d€ddd‚dƒd„d…d†d‡dˆd‰dŠd‹dŒg fdddƒd…d‡d‰gfg¡ej" $duddMej;ddddddddddddg ej<dvfddTe ;d_dwd_dwd_dwd_dwd_dwdwdwg ¡fg¡ddŽ„ƒƒƒZAej" $d@ddg¡ej" $dtddg¡ej" $duddMej;ddddddddddddg ej<dvfddTe ;dUdVdVdwdwdwdUdUdwdwdwdwg ¡fg¡dd„ƒƒƒZBej" $d‘ddMdddgfddTdUdUd_gfg¡d’d“„ƒZCej" $d”d•eDd–ƒd—dd˜d™d™dšgfd›eDdœƒdgd—dd˜d™dždgfg¡ej" $d@ddg¡dŸd „ƒƒZEej" $d¡ddg¡d¢d£„ƒZFd¤d¥„ZGd¦d§„ZHd¨d©„ZIdªd«„ZJd¬d­„ZKej" $d ddg¡d®d¯„ƒZLdS)°z­
these are systematically testing all of the args to value_counts
with different size combinations. This is to ensure stability of the sorting
and proper parameter handling
é)ÚproductN)    Ú CategoricalÚCategoricalIndexÚ    DataFrameÚGrouperÚIndexÚ
MultiIndexÚSeriesÚ
date_rangeÚ to_datetimecCsˆtdgdgdœƒ}|d d¡|d<| d¡d ¡}tddggddgd}|d d¡|d<t |¡}tdg|d    d
}t ||¡dS) NÚfemaleÚUS)ÚgenderÚcountryrÚcategoryr©ÚcolumnséÚcount©ÚindexÚname)    rÚastypeÚgroupbyÚ value_countsrÚ
from_framer    ÚtmÚassert_series_equal)ÚdfÚresultZdf_mi_expectedZ mi_expectedÚexpected©r!ú]d:\z\workplace\vscode\pyvenv\venv\Lib\site-packages\pandas/tests/groupby/test_value_counts.pyÚ.tests_value_counts_index_names_category_columnsþÿ
r#cCsØtj d¡tddd}ttj tdƒ|¡tj ||¡tj d|d|¡dœƒ}|rÔ|d d    ¡|d<tj    |j
ddd
…d f<tj    |j
d dd …df<tj    |j
ddd…df<tj    |j
ddd…df<tj    |j
ddd…df<|S)NiÒz
2015-08-24é
)ZperiodsÚabcdr)Ú1stÚ2ndÚ3rdr(Úfloaté r&éér'éééé    ) ÚnpÚrandomÚseedr
rÚchoiceÚlistÚrandintrÚnanÚloc)Ú    seed_nansÚnÚmÚdaysÚframer!r!r"Úseed_df/s    ýÿr>)TF)édiè)éér@r(rér&r'ú-zdf, keys, bins, n, m)ÚidsÚisortTFznormalize, name)TÚ
proportion)FrÚsortÚ    ascendingÚdropnac Cs¦dd„} |||    |
|dœ} |j||d} | djf| Ž}|j||d} | djtjf| Ž}|jjdd…dg|j_| |¡}t| ||fƒ\}}t     | 
¡| 
¡¡dS)NcSs2tt|jjt|jjƒƒƒ}tj||jjd|_|S)N©Únames)    r5ÚmaprZget_level_valuesÚrangeZnlevelsrÚ from_arraysrK)rZarrr!r!r"Ú rebuild_index^sz7test_series_groupby_value_counts.<locals>.rebuild_index)Ú    normalizerGrHrIÚbins©rGr(éÿÿÿÿ) rrÚapplyr    rrKÚrenamerLrrÚ
sort_index)rÚkeysrQr:r;rErPrrGrHrIrOÚkwargsZgrÚleftÚrightr!r!r"Ú test_series_groupby_value_countsTs
û
r[Úutcc    Cs¤tdddddddgddd    d    d
d
d gd œƒ d g¡}t|d|dd|d<| tddd¡}|d ¡ ¡}|d tj¡ ¡}|j    j
|j    _
|  d¡}t   ||¡dS)Né©GI]é)™J]鍙J]é©êK]é)<M]éU=M]驍N]ÚappleÚbananaÚorangeÚpear©Ú    TimestampÚFoodr+riÚs©r\ÚunitÚDatetimeÚ1D©ÚfreqÚkeyrjr)rÚdropr rrrrVrTr    rrKrUrr)r\rÚdfgrr r!r!r"Ú-test_series_groupby_value_counts_with_grouperys*ù    öÿ ó 
rurÚAÚBÚCcCsft|d}| |dd…¡}||d ¡}tg|jdd}tjggt|ƒ|d|_t     
||¡dS)NrrSr)ÚdtyperrJ) rrrr    ryrrNÚlenrrr©rrrtrr r!r!r"Ú&test_series_groupby_value_counts_empty˜s 
r|cCsPttt|ƒƒg|d}| |dd…¡}||d ¡}| ¡}t ||¡dS)N)ÚdatarrS)rrMrzrrrrr{r!r!r"Ú(test_series_groupby_value_counts_one_row¥s
r~c Cspttdgddgdƒ}| dg¡ ¡}tddgt t ddg¡tddgddgdddg¡d    d
}t     
||¡dS) NÚaÚb)Ú
categoriesrrFr)rZorderedryr©r}rr) r    rrrrrNr1Úarrayrrr)rkrr r!r!r"Ú/test_series_groupby_value_counts_on_categorical±s" ÿþÿör„c    CsÊtddddddgddddddgddddddgdœƒ}|jd    d
gd d d }|jd d }tddgddgdddggdddddgdddddgdddddggd    d
d gd}tdddddg|dd}t ||¡dS)NÚmaler ÚlowÚmediumÚhighr ÚFR©rÚ    educationrrrFrRr‹rrrB©ÚlevelsÚcodesrKrr)rrrrr    rr)rÚgbrrr r!r!r"Ú(test_series_groupby_value_counts_no_sortÌsýÿ &ýrc    Cs4tddddddgddddddgddddddgdœƒS)    Nr…r r†r‡rˆr r‰rŠ©rr!r!r!r"Ú education_dfàs ýÿr’c    Cs4|jddd}tjtdd| ¡W5QRXdS)Nrr©Úaxisr”©Úmatch)rÚpytestÚraisesÚNotImplementedErrorr©r’Úgpr!r!r"Ú    test_axisësrœc    Cs6| d¡}tjtdd|jdgdW5QRXdS)NrÚsubsetr•©r)rr—r˜Ú
ValueErrorrršr!r!r"Útest_bad_subsetñs
r cCs\| d¡ddgjdd}tdddddgtjdd    d
d d gdddgd dd}t ||¡dS)Nrrr‹T©rPçà?çÐ?©r‰r…r†©r‰r rˆ©r‰r…r‡©r r rˆ©r r…r†rJrFr‚)rrr    rÚ from_tuplesrr)r’rr r!r!r"Ú
test_basic÷s"ÿ ûø
ôrªcCs||j|||dS)N©rPrGrH)r)rrWrPrGrHr!r!r"Ú_frame_value_counts sr¬rÚcolumnrƒÚfunctionzsort, ascending)FN)TTÚas_indexr=c sdˆdj‡fdd„dœ|}ˆj||d}    |    ddgj|||d}
|r|     tddg|||¡} |rrt |
| ¡n|rzd    nd
}|  ¡jd |id d } |dkrÆ| jddid d } t     
| ddd¡| d<n0|dkrà| dd k| d<nt     
| ddd¡| d<t  |
| ¡nˆddˆdˆd<|    dj|||d} || _ |r¶| j jdd} | dj d¡j d ¡| d<| dj d¡j d ¡| d<| d=| jd did d } t | ¡| _ t |
| ¡nV|  d d| dj d¡j d ¡¡|  dd| dj d¡j d ¡¡| d=t  |
| ¡dS)Nrcsˆd|dkS)Nrr r!)Úx©r’r!r"Ú<lambda>+óz6test_against_frame_and_seriesgroupby.<locals>.<lambda>)r­rƒr®)Úbyr¯rr‹r«rFrrrr“r­Úlevel_0r r‰r®rCZbothF©rrB)ÚvaluesrrrTr¬rrÚ reset_indexrUr1ÚwhereÚassert_frame_equalrrZto_frameÚstrÚsplitÚgetrrÚinsert) r’rrPrrGrHr¯r=r´r›rr Z index_framer!r±r"Ú$test_against_frame_and_seriesgroupbysd
ýü ÿÿ ÿ ""r¿rPzCsort, ascending, expected_rows, expected_count, expected_group_sizer+éc
s†ˆjddgddd}|dj|||d}tƒ}    dD]‰‡‡fdd    „|Dƒ|    ˆ<q2|rn||    d
<|    d
|<n||    d <t ||    ¡dS) NrrF)r¯rGr‹r«)rrr‹csg|]}ˆˆ|‘qSr!r!©Ú.0Úrow©r­r’r!r"Ú
<listcomp>tsz!test_compound.<locals>.<listcomp>rFr)rrrrrº)
r’rPrGrHÚ expected_rowsZexpected_countZexpected_group_sizer›rr r!rÄr"Ú test_compound[sÿrÇcCs4tddddgddddgddddgdœddd    d
gd S) NrrBrÀér)rrÚnum_legsÚ    num_wingsZfalconZdogÚcatZantr¶r‘r!r!r!r"Ú
animals_df}s"
þrÌz?sort, ascending, normalize, name, expected_data, expected_indexr)rrr)rBrÀrÈ)rBrr)rBrÈrÀ)rÀrBrÈ)rrBrrFr¢r£c
Cs`|j|||d}t|tj|dddgd|d}t ||¡| d¡j|||d}    t |    |¡dS)N)rGrHrPrrrÉrÊrJr‚)rr    rrNrrr)
rÌrGrHrPrÚ expected_dataÚexpected_indexÚ result_framer Úresult_frame_groupbyr!r!r"Útest_data_frame_value_counts…s(ÿÿû 
ÿrÑc Cs`tj}tdd|d|ddddg    ddd||ddddg    dddddd|d|g    ddddddd||g    d    œƒS)
NrrÀrÈr+rBr@r/r-)rvrwrxÚD)r1r7r)r:r!r!r"Únulls_df­süÿrÓz:group_dropna, count_dropna, expected_rows, expected_valuesr-rÈr/gð?c
stˆjddg|d}|jdd|d}tƒ}ˆjD]‰‡‡fdd„|Dƒ|ˆ<q.t |¡}t||dd    }    t ||    ¡dS)
Nrvrw)rIT)rPrGrIcsg|]}ˆˆ|‘qSr!r!rÁ©r­rÓr!r"rÅÏsz,test_dropna_combinations.<locals>.<listcomp>rFr‚)    rrrrrrr    rr)
rÓZ group_dropnaZ count_dropnarÆÚexpected_valuesr›rrrr r!rÔr"Útest_dropna_combinationsºs
 
rÖcCs(tddddgddddgd||dgdœƒS)NrÚJohnÚAnneÚBethÚSmithÚLouise)rrÚ
first_nameÚ middle_namer‘)Z nulls_fixturer!r!r"Únames_with_nulls_dfÕs 
 
 
ýÿrÞz%dropna, expected_data, expected_index©rr)rÙr×)rÛrÚrrrÜrÝrJrØrÙr×rÛrÚrŒc    Cs`|j||d}t|||d}|r0|tt|ƒƒ}t ||¡| d¡j||d}t ||¡dS)N)rIrPr‚rr)rr    r)rzrrr)    rÞrIrPrrÍrÎrÏr rÐr!r!r"Ú#test_data_frame_value_counts_dropnaàs!ý 
ÿràÚobservedznormalize, name, expected_data©ryçc Cs¾| d¡jd||d}|j|d}tjddddd    d
d d d dddg dddgd}t|||d}    tdƒD]"}
|    jjt    |    jj
|
ƒ|
d|    _qd|ršt   ||    ¡n |    j |r¦dndd} t  || ¡dS)Nrr©r¯rár¡r¤r¥r¦©r‰r r†©r‰r r‡©r‰r…rˆr§r¨©r r r†©r r r‡©r r…rˆ©r r…r‡rr‹rJr‚r+©ÚlevelrFr©r)rrrrr©r    rMrÚ
set_levelsrrrrr¸rº© r’r¯rárPrrÍr›rrÎÚexpected_seriesÚir r!r!r"Ú=test_categorical_single_grouper_with_only_observed_categoriessL
ÿ ôñý ÿ
 
ÿróc CsÖ| ¡ d¡}|dj dg¡|d<|jd||d}|j|d}t|tj|dddgd|d    }    t    d
ƒD]@}
t
|    j j |
ƒ} |
d kr”|   |djj¡} |    j j| |
d |    _ qf|rºt ||    ¡n|    j|d } t || ¡dS)NrrÚASIArär¡rr‹rJr‚r+rrìrî)ÚcopyrrËZadd_categoriesrrr    rr©rMrrrZset_categoriesrrïrrr¸rº) r’r¯rárÎrPrrÍr›rrñròZ index_levelr r!r!r"Ú!assert_categorical_single_grouperTs. þú 
ÿ röc Cs6ddddddddd    d
d d g }t||d ||||ddS)Nr¤r¥r¦rårærçr§r¨rèrérêrëT©r’r¯rárÎrPrrÍ©rö©r’r¯rPrrÍrÎr!r!r"Ú-test_categorical_single_grouper_observed_truews,ôùrúcCsBddddddddd    d
d d d dddddg}t||d||||ddS)Nr¤r¥r¦rårçrær§r¨rërêrérè)rôr…r†)rôr…rˆ)rôr r‡)rôr r†)rôr rˆ)rôr…r‡Fr÷rørùr!r!r"Ú.test_categorical_single_grouper_observed_false¦s8-îùrûzobserved, expected_index)r‰rˆr )r‰rˆr…)r‰r†r…)r‰r†r )r‰r‡r…)r‰r‡r )r rˆr )r rˆr…)r r†r…)r r†r )r r‡r )r r‡r…c CsÜ| ¡}|d d¡|d<|d d¡|d<|jddg||d}|j|d}t|r^||dkn|tj|dddgd|d    }    td
ƒD]"}
|    jj    t
|    jj |
ƒ|
d |    _q‚|r¸t   ||    ¡n |    j|rÄd nd d} t  || ¡dS)Nrrr‹rär¡rãrrJr‚rBrìrFrrî©rõrrrr    rr©rMrrïrrrrr¸rº) r’r¯rárÎrPrrÍr›rrñròr r!r!r"Ú"test_categorical_multiple_groupersòs87ÿ þú ÿ
 
ÿrýc Csæ| ¡}|d d¡|d<|d d¡|d<|jd||d}|j|d}ddd    d
d d d dddddg }t|tj|dddgd|d}    tddƒD]"}
|    jj    t
|    jj |
ƒ|
d|    _qŒ|rÂt   ||    ¡n |    j|rÎdndd} t  || ¡dS)Nrrr‹rrär¡r¤r¥r¦rårærçr§r¨rèrérêrërJr‚rr+rìrFrrîrürðr!r!r"Útest_categorical_non_groupersHsJ ôþúÿ
 
ÿrþz*normalize, expected_label, expected_valuesc Cs–tdddgdddgdœƒ}|jdddgddd    „gd
d }|jd |d }tdtjdddgtjdddddgddddgddddg||iƒ}t ||¡dS)NrrBr+)rvrwrÀr@rvcSs|dkr dSdS)Nrr-r/r!)ròr!r!r"r²”r³z&test_mixed_groupings.<locals>.<lambda>F©r¯T)rGrPrµrâZlevel_2r/r-rw)rrrr1rƒÚint_rrº)rPÚexpected_labelrÕrr›rr r!r!r"Útest_mixed_groupingsŠs"    ûÿ    rztest, columns, expected_namesÚrepeatZabbderÚdr€Úerír%Úlevel_1Úcc
CsÆtdddddgdddd    d
gg|d }d d g}dtjddgtjddg}|j||d ¡}|r‚tdtj||ddd}t     
||¡n@dd„|Dƒ}t |ƒ}    d|    d<|      d¡t||    d }t      ||¡dS)Nrr+r@r-r0rBrÀrÈr/r$r)rrr-r+r@r0)rBrr/rÀrÈr$rrrârrÿrßrJrr‚cSsg|]}t|ƒdg‘qS)r)r5rÁr!r!r"rższ0test_column_label_duplicates.<locals>.<listcomp>r)rr1rƒÚint64rrr    rr©rrr5Úappendrº)
ÚtestrZexpected_namesr¯rrÍrWrr Zexpected_columnsr!r!r"Útest_column_label_duplicates¢s( $þú
 r znormalize, expected_labelc    CsZtdddggdd|gdjddd}d    |›d
}tjt|d |j|d W5QRXdS) NrrBr+rr€rFrÿzColumn label 'z' is duplicate of result columnr•r¡)rrr—r˜rŸr)rPrrÚmsgr!r!r"Útest_result_label_duplicatesÄs    ÿ r cCsftdddgiƒ}| tjddgtjd¡}| ¡}tdgtjddggddgddd}t     
||¡dS)NrrrârBrJrr) rrr1rƒrrr    rr©rr)rrrr r!r!r"Útest_ambiguous_groupingÕsÿrc    CsZtdddgdddgdœdddgd    }d
}tjt|d | d ¡jd gd W5QRXdS)Nrr€rr°Úy©Úc1Úc2rrr¶z;Keys {'c1'} in subset cannot be in the groupby column keys.r•rrž©rr—r˜rŸrr©rr r!r!r"Ú"test_subset_overlaps_gb_key_raisesàs$rc    CsZtdddgdddgdœdddgd    }d
}tjt|d | d ¡jd gdW5QRXdS)Nrr€rr°rrrrr¶z4Keys {'c3'} in subset do not exist in the DataFrame.r•rÚc3ržrrr!r!r"Ú!test_subset_doesnt_exist_in_frameès$rcCsvtdddgdddgdœdddgd    }|jdd
jd gd }tdd gtjddgddggdd gddd}t ||¡dS)Nrr€rr°rrrrr¶rìrržrBrJrr©rrrr    rrNrr©rrr r!r!r"Ú test_subsetðs$ýrcCsŒtdddgdddgdddggdddgdddgd    }|jdd
jdgd }tdd gtjddgddgddggdddgd dd}t ||¡dS)Nrr°r€rrrrr)rrrìržrBrJrrrrr!r!r"Útest_subset_duplicate_columnsüsýÿûrc
Csätdddddddgddd    d    d
d
d gd œƒ d g¡}t|d|dd|d<| tddd¡}| ¡}tddddg|d}|d ¡}t||dd    d
d ggdddddd gtdƒdddddd ggdddgd}t    d|dd }t
  ||¡dS)!Nr]r^r_r`rarbrcrdrerfrgrhr+rirkrlrnrorpz
2019-08-06z
2019-08-07z
2019-08-09z
2019-08-10)r\rrrBrÈrjrŒrr) rrsr rrrÚuniquerrMr    rr)r\rrrÚdatesZ
timestampsrr r!r!r"Útest_value_counts_time_groupers:ù    öÿ ó
ÿ $ýr)MÚ__doc__Ú    itertoolsrÚnumpyr1r—Zpandasrrrrrrr    r
r Zpandas._testingZ_testingrr#r>ZbinnedrDr9r:r;rZarangeÚmaxrQrWÚkr€r    ÚmarkZslowZ parametrizer[rur|r~r„rZfixturer’rœr rªr¬r¿rÇrÌrÑrÓrÖrÞrNr7ràrƒrrórörúrûrýrþrr5r r rrrrrrr!r!r!r"Ú<module>s ,  $ 
 
 
 
 
ýþ>***ýþ 
úüþ
 
ü""÷þ
 
 
þý    
 ý ùý÷þ &ý ýúþ0#&ý ýúþ &ÿýîÿýøþ& %ôþûþîþ &ý üúþ%&ý üúþ0þþ
þþ þþ