13.3 SQL Data Set Implementation

This data set implementation allows access to the popular MySQL open source database system. It uses the Python module MySQLdb which has to be installed separately13.2.

Note: Support for a wider range of SQL databases will be added to Febrl in a future release. More information on the Python database API can be found at http://www.python.org/topics/database/, including a list of Python modules for various (commercial and open source) databases.

The SQL data set implementation currently allows sequential access only.

The fields attribute of an SQL data set must be a dictionary where the keys are the field names and the values are the names of the corresponding table columns in the SQL database.

If an SQL data set is initialised in write or append mode it is assumed that the table is already created and contains the necessary columns. The main reason for this is when creating an SQL table it is necessary to give the types and length of all table columns, information which are not available when a data set is initialised.

Additional attributes (besides the general data set attributes as described above) for a SQL data set are

The following example shows how to initialise a SQL data set and how to access it in read mode. It is assumed that the dataset.py module has been imported using the import dataset command.

# ====================================================================

mydata = dataset.DataSetSQL(name = 'hospital-data',
                     description = 'Hospital data from 1990-2000',
                    access_right = 'read',
                   database_name = 'hospital-data',
                   database_user = 'exampleuser',
                      table_name = 'hospital',
                          fields = {'year':'year',
                                    'surname':'sname',
                                    'givenname':'gname',
                                    'dob':'dateofbirth',
                                    'address':'wayfare',
                                    'postcode':'pcode',
                                    'state':'territory'},
                  fields_default = '',
                    strip_fields = True,
                  missing_values = ['','missing'])

print mydata.num_records  # Print total number for records

first_record = mydata.read_record()  # Returns one record

hundred_records = mydata.read_records(1000,100)  # Read 100 records
ten_records = mydata.read_records(2000,10)  # Read another 10 records

mydata.finalise()  # Close file, finalise access to data set

Note: In its current implementation, an SQL data set can only consist of one underlying database table. The handling of multiple tables as one data set will be implemented in a future version of Febrl.



Footnotes

... separately13.2
 Available for download from http://sourceforge.net/projects/mysql-python