fix: set Python environment variables for PySpark

- Add PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variable settings
- Specify Python 3.6 as the Python version used by PySpark
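
For reference, a minimal sketch of the same idea with the interpreter path made configurable. The helper name `configure_pyspark_python` and its `python_path` and `env` parameters are hypothetical and not part of this commit, which hard-codes /usr/local/bin/python3.6. The sketch assumes the variables are set before any SparkContext or SparkSession is created, since PySpark reads them at startup.

```python
import os
import sys


def configure_pyspark_python(python_path=None, env=None):
    """Point the PySpark driver and workers at the same interpreter.

    Hypothetical helper for illustration only; the commit sets the two
    variables directly at module level.
    """
    # Default to the real process environment; a plain dict can be
    # passed instead to try the helper without side effects.
    env = os.environ if env is None else env
    # Fall back to the interpreter running this script (an assumption;
    # the commit uses a fixed path instead).
    path = python_path or sys.executable
    env["PYSPARK_PYTHON"] = path          # interpreter used by workers
    env["PYSPARK_DRIVER_PYTHON"] = path   # interpreter used by the driver
    return path


if __name__ == "__main__":
    # Same path as in the commit; call this before creating a SparkContext.
    configure_pyspark_python("/usr/local/bin/python3.6")
```

Using one path for both variables keeps the driver and workers on the same Python version, which PySpark requires.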
fly6516 2025-04-22 13:39:30 +08:00
parent 2018debf80
commit 73ae9b135b

@@ -3,6 +3,8 @@ import os
 # Set Java environment variable
 os.environ['JAVA_HOME'] = '/opt/module/jdk1.8.0_171'
+os.environ["PYSPARK_PYTHON"]="/usr/local/bin/python3.6"
+os.environ["PYSPARK_DRIVER_PYTHON"]="/usr/local/bin/python3.6"
 # Parse ratings data into (userID, movieID, rating)
 def get_ratings_tuple(entry):