pyspark.pandas.read_sql

pyspark.pandas.read_sql(sql, con, index_col=None, columns=None, **options)
Read SQL query or database table into a DataFrame.

This function is a convenience wrapper around read_sql_table and read_sql_query (for backward compatibility). It delegates to the specific function depending on the provided input: a SQL query is routed to read_sql_query, while a database table name is routed to read_sql_table. Note that the delegated functions may have more specific notes about their functionality that are not listed here. A sketch of this delegation follows below.

Note: Some databases might hit the Spark issue SPARK-27596.
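As a rough illustration of that delegation, here is a minimal sketch. The heuristic shown (treating any string containing whitespace as a query) is an assumption for illustration only; the actual check used by pyspark.pandas may differ:

import pyspark.pandas as ps

def read_sql_sketch(sql, con, index_col=None, columns=None, **options):
    # Hypothetical heuristic: a bare identifier is treated as a table name,
    # anything containing whitespace (e.g. "SELECT ...") as a query.
    if " " in sql.strip():
        # columns is only honored when reading a table, so it is not forwarded.
        return ps.read_sql_query(sql, con, index_col=index_col, **options)
    return ps.read_sql_table(sql, con, index_col=index_col, columns=columns, **options)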
Parameters

sql : string
    SQL query to be executed or a table name.
con : str
    A JDBC URI could be provided as str.
    Note: The URI must be a JDBC URI instead of Python's database URI.
index_col : string or list of strings, optional, default: None
    Column(s) to set as index (MultiIndex).
columns : list, default: None
    List of column names to select from the SQL table (only used when reading a table).
options : dict
    All other options passed directly into Spark's JDBC data source.
 
Returns

DataFrame
 
See Also

read_sql_table
    Read SQL database table into a DataFrame.
read_sql_query
    Read SQL query into a DataFrame.
Examples

>>> ps.read_sql('table_name', 'jdbc:postgresql:db_name')
>>> ps.read_sql('SELECT * FROM table_name', 'jdbc:postgresql:db_name')
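A slightly fuller sketch, assuming a hypothetical PostgreSQL host, table, and credentials; the user and password keywords are standard Spark JDBC options forwarded via **options:

import pyspark.pandas as ps

url = "jdbc:postgresql://localhost:5432/db_name"  # hypothetical JDBC URI

# A bare table name delegates to read_sql_table; columns is honored here.
df = ps.read_sql(
    "table_name",
    url,
    index_col="id",            # hypothetical column used as the index
    columns=["id", "name"],    # hypothetical column selection
    user="spark",              # hypothetical credentials, passed through **options
    password="secret",
)

# A query delegates to read_sql_query; columns is ignored in this case.
df2 = ps.read_sql("SELECT id, name FROM table_name WHERE id > 10", url,
                  user="spark", password="secret")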