Commit

Update README.md
moomindani authored Jul 26, 2023
1 parent 64c5163 commit 9d82939
Showing 1 changed file with 10 additions and 12 deletions.
22 changes: 10 additions & 12 deletions README.md
@@ -1,6 +1,6 @@
# aws-glue-libs

-This repository supports python libraries for local development of glue pyspark batch jobs. Glue streaming is not supported with this library.
+This repository supports python libraries for local development of glue pyspark batch jobs. Glue streaming is supported in the separate repository [aws-glue-streaming-libs](https://github.com/awslabs/aws-glue-streaming-libs).

## Contents
This repository contains:
@@ -11,13 +11,11 @@ This repository contains:

Different Glue versions support different Python versions. The following table below is for your reference, which also includes the associated repository's branch for each glue version.

-| Glue Version | Python 2 Version | Python 3 Version | aws-glue-libs branch |
-|---|---|---|----------------------|
-| 0.9 | 2.7 | Not supported | glue-0.9 |
-| 1.0 | 2.7 | 3.6 | glue-1.0 |
-| 2.0 | Not supported | 3.7 | glue-2.0 |
-| 3.0 | Not supported | 3.7 | glue-3.0 |
-| 4.0 | Not supported | 3.10 | master |
+| Glue Version | Python 3 Version | aws-glue-libs branch |
+|---|---|----------------------|
+| 2.0 | 3.7 | glue-2.0 |
+| 3.0 | 3.7 | glue-3.0 |
+| 4.0 | 3.10 | master |

You may refer to AWS Glue's official [release notes](https://docs.aws.amazon.com/glue/latest/dg/release-notes.html) for more information
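
As a rough illustration of how the post-commit version table might be used from a script, the sketch below encodes the three remaining rows as a lookup and builds a `git clone` command for the matching branch. The names `GLUE_VERSIONS`, `branch_for`, and `clone_command` are hypothetical, and the repository URL `https://github.com/awslabs/aws-glue-libs.git` is an assumption that does not appear in this diff.

```python
# Hypothetical lookup mirroring the table as it stands after this commit.
# Glue 0.9 and 1.0 are intentionally absent because their rows were removed.
GLUE_VERSIONS = {
    "2.0": {"python": "3.7", "branch": "glue-2.0"},
    "3.0": {"python": "3.7", "branch": "glue-3.0"},
    "4.0": {"python": "3.10", "branch": "master"},
}

# Assumed repository URL; it is not stated in the diff itself.
REPO_URL = "https://github.com/awslabs/aws-glue-libs.git"


def branch_for(glue_version: str) -> str:
    """Return the aws-glue-libs branch listed for a Glue version."""
    try:
        return GLUE_VERSIONS[glue_version]["branch"]
    except KeyError:
        raise ValueError(f"Glue version {glue_version!r} is not in the table") from None


def clone_command(glue_version: str) -> str:
    """Build a git command that clones the branch for a Glue version."""
    return f"git clone --branch {branch_for(glue_version)} {REPO_URL}"


if __name__ == "__main__":
    for version in GLUE_VERSIONS:
        print(version, "->", clone_command(version))
```

Asking for a removed version such as `"1.0"` raises `ValueError` in this sketch, which matches the table no longer listing it.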

@@ -30,15 +28,11 @@ The `awsglue` library provides only the Python interface to the Glue Spark runti
1. install Apache Maven from the following location: https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-common/apache-maven-3.6.0-bin.tar.gz
1. use `copy-dependencies` target in Apache Maven to download the jar from S3 to your local dev environment.
1. download and extract the Apache Spark distribution based on the Glue version you're using:
-* Glue version 0.9: `https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-0.9/spark-2.2.1-bin-hadoop2.7.tgz`
-* Glue version 1.0: `https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-1.0/spark-2.4.3-bin-hadoop2.8.tgz1`
* Glue version 2.0: `https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-2.0/spark-2.4.3-bin-hadoop2.8.tgz1`
* Glue version 3.0: `https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-3.0/spark-3.1.1-amzn-0-bin-3.2.1-amzn-3.tgz`
* Glue version 4.0: `https://aws-glue-etl-artifacts.s3.amazonaws.com/glue-4.0/spark-3.3.0-amzn-1-bin-3.3.3-amzn-0.tgz`
1. export the `SPARK_HOME` environmental variable to the extracted location of the above Spark distribution. For example:
```
-Glue version 0.9: export SPARK_HOME=/home/$USER/spark-2.2.1-bin-hadoop2.7
-Glue version 1.0: export SPARK_HOME=/home/$USER/spark-2.4.3-bin-hadoop2.8
Glue version 2.0: export SPARK_HOME=/home/$USER/spark-2.4.3-bin-hadoop2.8
Glue version 3.0: export SPARK_HOME=/home/$USER/spark-3.1.1-amzn-0-bin-3.2.1-amzn-3
Glue version 4.0: export SPARK_HOME=/home/$USER/spark-3.3.0-amzn-1-bin-3.3.3-amzn-0
@@ -59,6 +53,10 @@ The libraries in this repository licensed under the [Amazon Software License](ht
# Release Notes
+## July 26 2023
+* According to [AWS Glue version support policy](https://docs.aws.amazon.com/glue/latest/dg/glue-version-support-policy.html), branches for Glue 0.9 and 1.0 are removed as they are already deprecated.
## August 27 2021
* The master branch has been modified from representing Glue 0.9 to Glue 3.0, we have also created a glue-0.9 branch to reflect the former state of the master branch with Glue 0.9. To rename your local clone of the older master branch and point to the glue-0.9 branch, you may use the following commands:
```