
Implement the missing Series.swaplevel #1919

Merged: 13 commits into databricks:master on Nov 23, 2020

Conversation

xinrong-meng (Contributor) commented Nov 18, 2020

No description provided.
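For context, pandas' Series.swaplevel swaps two levels of a MultiIndex; a minimal pandas session sketching the behavior this PR mirrors in Koalas (the variable names here are illustrative, not from the PR):

>>> import pandas as pd
>>> midx = pd.MultiIndex.from_arrays([['a', 'b'], [1, 2]], names=['word', 'number'])
>>> s = pd.Series(['x', 'y'], index=midx, name='b')
>>> s.swaplevel()  # defaults i=-2, j=-1: swap the two innermost levels
number  word
1       a       x
2       b       y
Name: b, dtype: object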

ueshin (Collaborator) commented Nov 18, 2020

FYI: You have to remove the API from "missing" files and add a link in the doc.

itholic (Contributor) commented Nov 19, 2020

You can find the missing files in databricks/koalas/missing/series.py and the doc link in docs/source/reference/series.rst :)
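For readers unfamiliar with the repo layout, the change they describe roughly amounts to the following sketch; the exact stub and section names are assumptions based on the usual Koalas conventions, not taken from the diff:

# databricks/koalas/missing/series.py -- remove the "unsupported" stub so the
# new implementation is no longer reported as missing (assumed stub shown):
class MissingPandasLikeSeries(object):
    # swaplevel = _unsupported_function("swaplevel")   # <- delete this entry
    ...

# docs/source/reference/series.rst -- add Series.swaplevel to the relevant
# autosummary list so the API reference links to the new method.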

codecov-io commented Nov 20, 2020

Codecov Report

Merging #1919 (ced185b) into master (7877587) will increase coverage by 0.08%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1919      +/-   ##
==========================================
+ Coverage   94.08%   94.16%   +0.08%     
==========================================
  Files          41       41              
  Lines       10019    10022       +3     
==========================================
+ Hits         9426     9437      +11     
+ Misses        593      585       -8     
Impacted Files                        Coverage Δ
databricks/koalas/missing/series.py   100.00% <ø> (ø)
databricks/koalas/series.py           97.07% <100.00%> (+0.04%) ⬆️
databricks/koalas/namespace.py        84.23% <0.00%> (ø)
databricks/koalas/indexes.py          96.85% <0.00%> (+0.01%) ⬆️
databricks/koalas/groupby.py          91.42% <0.00%> (+0.98%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7877587...ced185b.

xinrong-meng marked this pull request as ready for review on November 21, 2020 01:02
index_map = list(zip(self._internal.index_spark_column_names, self._internal.index_names))
index_map[i], index_map[j] = index_map[j], index_map[i]
index_spark_column_names, index_names = zip(*index_map)
internal = self._kdf._internal.copy(
ueshin (Collaborator):
self._internal instead of self._kdf._internal?

xinrong-meng (Contributor, Author):

May I ask what the difference is in usage?

I see that self._internal returns self._kdf._internal.select_column(self._column_label).

ueshin (Collaborator):

self._kdf._internal contains other columns from the anchor DataFrame.

E.g.:

>>> midx = pd.MultiIndex.from_arrays([['a', 'b'], [1, 2]], names = ['word', 'number'])
>>> kdf = ks.DataFrame({'a': [1, 2], 'b': ['x', 'y']}, index=midx)
>>> kdf
             a  b
word number
a    1       1  x
b    2       2  y
>>> kdf.b.swaplevel()
number  word
1       a       1
2       b       2
Name: a, dtype: int64

should be:

>>> kdf.b.swaplevel()
number  word
1       a       x
2       b       y
Name: b, dtype: object

xinrong-meng (Contributor, Author):

Thanks for explaining!

I'm a little confused by "contains". Their spark_frame seems to be identical.

>>> kdf.b._internal.spark_frame.show()
+-----------------+-----------------+---+---+-----------------+
|__index_level_0__|__index_level_1__|  a|  b|__natural_order__|
+-----------------+-----------------+---+---+-----------------+
|                a|                1|  1|  x|      42949672960|
|                b|                2|  2|  y|      94489280512|
+-----------------+-----------------+---+---+-----------------+

>>> kdf.b._kdf._internal.spark_frame.show()
+-----------------+-----------------+---+---+-----------------+
|__index_level_0__|__index_level_1__|  a|  b|__natural_order__|
+-----------------+-----------------+---+---+-----------------+
|                a|                1|  1|  x|      42949672960|
|                b|                2|  2|  y|      94489280512|
+-----------------+-----------------+---+---+-----------------+

ueshin (Collaborator) commented Nov 23, 2020:

I mean their metadata:

>>> kdf.b._internal.column_labels
[('b',)]
>>> kdf.b._kdf._internal.column_labels
[('a',), ('b',)]

or

>>> kdf.b._internal.data_spark_columns
[Column<b'b'>]
>>> kdf.b._kdf._internal.data_spark_columns
[Column<b'a'>, Column<b'b'>]

The column_labels or data_spark_columns in the InternalFrame for a Series contains only the single entry for that Series.

xinrong-meng (Contributor, Author):

I see, that's clear to me now. Thank you!
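To summarize the thread, the level swap itself is just a reordering of the index metadata; below is a standalone sketch of that bookkeeping, with plain lists standing in for the InternalFrame fields (names taken from the diff above) and the fix being to copy self._internal, the per-Series view, rather than self._kdf._internal:

# Hypothetical stand-ins for self._internal.index_spark_column_names / index_names
index_spark_column_names = ["__index_level_0__", "__index_level_1__"]
index_names = [("word",), ("number",)]

i, j = -2, -1  # pandas' swaplevel defaults: the two innermost levels
index_map = list(zip(index_spark_column_names, index_names))
index_map[i], index_map[j] = index_map[j], index_map[i]
index_spark_column_names, index_names = zip(*index_map)

print(index_spark_column_names)  # ('__index_level_1__', '__index_level_0__')
print(index_names)               # (('number',), ('word',))

# In the PR, the swapped values are then passed to self._internal.copy(...)
# (not self._kdf._internal.copy(...)), so the resulting frame carries only
# this Series' own column label and data column.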

xinrong-meng requested a review from ueshin on November 23, 2020 17:25
ueshin (Collaborator) left a comment:

Otherwise, LGTM.

xinrong-meng merged commit 95ec75e into databricks:master on Nov 23, 2020