
How To Write A JSON File To S3 As Parquet

import json
import requests
import datetime
import boto3
import parquet
import pyarrow
import pandas as pd
from pandas import DataFrame

Solution 1:

Make sure your s3_object is an S3 URL string. It has to look something like this:

"s3://my_bucket/path/to/data_folder/my-file.parquet"

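A quick way to catch a malformed path before attempting the write is to check its shape first. The sketch below is only an illustration: is_s3_url is a hypothetical helper name, not part of boto3 or awswrangler, and it only checks the form of the string, not whether the bucket or key exists.

```python
from urllib.parse import urlparse


def is_s3_url(path: str) -> bool:
    """Return True if path has the form s3://bucket/key (hypothetical helper)."""
    parsed = urlparse(path)
    # The scheme must be "s3", the netloc is the bucket name,
    # and the path must hold a non-empty object key after the leading "/".
    return parsed.scheme == "s3" and bool(parsed.netloc) and len(parsed.path) > 1


print(is_s3_url("s3://my_bucket/path/to/data_folder/my-file.parquet"))  # True
print(is_s3_url("my_bucket/path/to/data_folder/my-file.parquet"))       # False
```

A bare bucket such as "s3://my_bucket" is rejected as well, because it names no object key to write to.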
Beyond that, I wouldn't recommend writing the DataFrame to S3 as Parquet with pandas alone. For Python 3.6+, AWS has a library called aws-data-wrangler that handles the integration between pandas, S3, and Parquet.

To install it, run:

pip install awswrangler

To write your DataFrame to S3, run:

import awswrangler as wr
wr.s3.to_parquet(df=df, path="s3://my_bucket/path/to/data_folder/my-file.parquet")
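Since the question starts from JSON, here is a minimal sketch of the whole path from a JSON payload to a DataFrame that is ready for that call. The function names json_to_frame and write_frame_to_s3 and the sample payload are assumptions made up for illustration; the actual upload also needs awswrangler installed and valid AWS credentials, so it is left commented out.

```python
import json

import pandas as pd


def json_to_frame(json_text: str) -> pd.DataFrame:
    """Parse a JSON array of objects into a flat DataFrame."""
    records = json.loads(json_text)
    # json_normalize flattens nested objects into dotted column names.
    return pd.json_normalize(records)


def write_frame_to_s3(df: pd.DataFrame, path: str) -> None:
    """Write df as Parquet to an s3:// path (needs awswrangler and AWS credentials)."""
    # Imported lazily so the parsing step above works without awswrangler.
    import awswrangler as wr

    wr.s3.to_parquet(df=df, path=path)


# Hypothetical payload, for example the body of an API response.
sample = '[{"id": 1, "user": {"name": "a"}}, {"id": 2, "user": {"name": "b"}}]'
df = json_to_frame(sample)
print(df.columns.tolist())

# Uncomment once the bucket and credentials are in place:
# write_frame_to_s3(df, "s3://my_bucket/path/to/data_folder/my-file.parquet")
```

Flattening first matters because deeply nested JSON objects do not always map cleanly onto Parquet columns; one plain column per field is the easiest shape to write and query.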
