Commit

feat: adding remove_duplicates transformation
cristian-rincon committed Nov 22, 2023
1 parent 2e5fa4c commit 732fc25
Showing 2 changed files with 2 additions and 0 deletions.
1 change: 1 addition & 0 deletions extractor/core.py
@@ -104,6 +104,7 @@ def extract_data(source_path: Path, format: str) -> None:
     output = pd.DataFrame(pkgs_raw_metadata)
     output["uppercased_name"] = output["name"].str.upper()
     output = output.sort_values(by=["uppercased_name"])
+    output = output.drop_duplicates(subset=["uppercased_name"], keep="first")
     del output["uppercased_name"]
     return output
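
Note on the change above: the new `drop_duplicates` call compares names case-insensitively (through the temporary `uppercased_name` column) and keeps the first row for each name after sorting. The following is a minimal standalone sketch of that behavior, not the project's actual function. `dedupe_by_name` and the sample records (including their version strings) are hypothetical. The sketch also passes `kind="stable"` to `sort_values`, which the commit does not do, so that the surviving row among equal names is deterministic; with the default sort kind, the relative order of tied rows is not guaranteed.

```python
import pandas as pd


def dedupe_by_name(records: list[dict]) -> pd.DataFrame:
    """Sort records case-insensitively by name and drop repeated names.

    Hypothetical helper mirroring the steps shown in the diff.
    """
    output = pd.DataFrame(records)
    # Temporary key so that "Pandas" and "pandas" compare as equal.
    output["uppercased_name"] = output["name"].str.upper()
    # kind="stable" is an assumption of this sketch (not in the commit):
    # it preserves input order among rows with the same key.
    output = output.sort_values(by=["uppercased_name"], kind="stable")
    # Keep only the first row for each case-insensitive name.
    output = output.drop_duplicates(subset=["uppercased_name"], keep="first")
    del output["uppercased_name"]
    return output.reset_index(drop=True)


# Made-up sample data: "pandas" appears twice, once without a version.
sample = [
    {"name": "pytest", "version": "7.3.1"},
    {"name": "pandas", "version": "2.0.3"},
    {"name": "Pandas", "version": None},
    {"name": "dbx", "version": "0.8.10"},
]
result = dedupe_by_name(sample)
print(result)
```

With a stable sort, the row that survives is whichever duplicate appears first in the input, so the order of the parsed records decides which version is reported.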

1 change: 1 addition & 0 deletions tests/mocks/in/pip_freeze/sample_2.txt
@@ -2,6 +2,7 @@ setuptools==68.0.0
 wheel==0.40.0
 dbx==0.8.10
 typing_extensions==4.5.0
+pandas
 pytest==7.3.1
 pytest-cov==4.1.0
 pre-commit==3.3.2