
Wednesday, May 14, 2014

Setting up Python

This is pretty trivial, but I just booted up a brand new computer and have been in the process of downloading things for hours.

1. Download/install Python.  I use Python 2.7.

2. From PowerShell, run this to make sure you can run Python from anywhere:
(django-jay) PS C:\Python27\> [Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\Python27", "User")
3. Install pip. Download get-pip.py, then run it from your PowerShell window:
(django-jay) PS C:\Python27\> python get-pip.py
4. Set the PATH environment variable to include the C:\Python27\Scripts directory
(django-jay) PS C:\Python27\> [Environment]::SetEnvironmentVariable("Path", "$env:Path;C:\Python27\Scripts", "User")
5. Quit out of PowerShell and restart it. Now run the following command (a quick sketch of creating and activating a virtualenv follows the list):
PS C:\Python27> pip install virtualenv
6. Download/install PostgreSQL (this gives you access to the psql shell)
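For completeness, here's roughly what creating and activating a virtualenv looks like once step 5 is done; that's where the (django-jay) prefix in the prompts above comes from. The environment name is just mine, and you may need to allow script execution first with Set-ExecutionPolicy RemoteSigned:

PS C:\> virtualenv django-jay
PS C:\> .\django-jay\Scripts\activate.ps1
(django-jay) PS C:\>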

That's it for the stuff I need! I'll update this as it evolves.

Saturday, May 3, 2014

Copying an entire database into a django-based project using psql

This title will be a bit misleading.  I know what you are thinking...

Great, this guy has a way for me to define a Django model, then copy in an entire database: constraints, defaults, foreign keys and all!

Well, no, I don't have a great way to do that.  Is it possible? Probably. But I don't know how to do it.

I have some tables whose relationship structure I want to preserve. So when I add tables to my Django project (models.py) and sync up to the database, I need to disable the automatically generated primary key default so I can insert the records into the database via the \COPY command.  Finally, I need to reinstate the serial behavior of the id field.

So, first, connect to the database using the method described in a previous post.

Then create the database model in models.py. In a complex data model situation, I like to bring the tables in one at a time. That means creating a class in models.py, then syncing to the database one table (or collection of related tables) at a time.

Let's add the States class, which is simply a listing of the United States. These will be used in conjunction with the previously created Zipcodes class:

class States(models.Model):
    abbr = models.CharField(max_length=2)
    name = models.CharField(max_length=35)
    modifieduser = models.CharField(max_length=50, default='admin')
    modifieddate = models.DateField(auto_now=True)
In the PowerShell window, let's sync this up to the database.
(django-jay) PS C:\Python27\home\jtarlecki\djcode\django-jay\rebates> python manage.py syncdb
Which will hopefully result in some confirmation text saying that the table has been created.

Creating tables ...
Creating table rebate_states
Installing custom SQL ...
Installing indexes ...
Installed 0 object(s) from 0 fixture(s)
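As a sanity check, once you connect with psql (next paragraph), \d on the new table shows the columns, the primary key, and the nextval default that syncdb set up:

rebates=> \d rebate_states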
Now, connect to that database with psql and drop the default on the id field. Basically, the idea here is to change it from a SERIAL field to a simple INTEGER field.  Dropping the default doesn't touch the primary key constraint itself; it just removes the auto-increment behavior in PostgreSQL, which is what lets us supply explicit id values when we copy the records in.

rebates=> ALTER TABLE rebate_states ALTER COLUMN id DROP DEFAULT;
Great-- with the default dropped, now let's copy in all the data.
jaytarle_rebates=> \COPY rebate_states FROM C:\Users\Dorothy\Desktop\states.csv WITH CSV HEADER;
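Since there's no column list on that \COPY, the CSV has to supply every column in table order, id included (which is exactly why we dropped the default). Mine looks roughly like this; the values here are illustrative:

id,abbr,name,modifieduser,modifieddate
1,AL,Alabama,admin,2014-05-03
2,AK,Alaska,admin,2014-05-03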
Now, we give the id field its serial behavior back by setting its default to the sequence's nextval.  First, let's look at the syntax nicely formatted.
DROP SEQUENCE rebate_states_id_seq; 

CREATE SEQUENCE rebate_states_id_seq
MAXVALUE 2147483647;
 
ALTER TABLE rebate_states 
ALTER COLUMN id TYPE INTEGER; 

ALTER TABLE rebate_states 
ALTER COLUMN id SET DEFAULT nextval('public.rebate_states_id_seq'::text);
Now, just squish all that into one line so you can execute it from the psql shell:
jaytarle_rebates=> DROP SEQUENCE rebate_states_id_seq; CREATE SEQUENCE rebate_states_id_seq MAXVALUE 2147483647; ALTER TABLE rebate_states ALTER COLUMN id TYPE INTEGER; ALTER TABLE rebate_states ALTER COLUMN id SET DEFAULT nextval('public.rebate_states_id_seq'::text);
From which we get the following confirmation:
DROP SEQUENCE
CREATE SEQUENCE
ALTER TABLE
ALTER TABLE 
And, voila... our records are in the database and re-synced up with our django project.
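One caveat worth flagging: the recreated sequence starts back at 1, so the very next insert would collide with the ids we just copied in. Bumping the sequence past the highest existing id takes care of that:

jaytarle_rebates=> SELECT setval('rebate_states_id_seq', (SELECT MAX(id) FROM rebate_states));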

Did I copy in a whole database? Well, no. I need to do this process for every table. I know, it sucks.  Perhaps there is a better way...

Monday, April 28, 2014

Copying tables in and out using psql

I have some tables that exist in another database (which is also remote).  Instead of doing cross-database queries, which many RDBMSs do not allow against remote servers, let's just copy the tables out into a CSV and then copy them back in.

I know there are a million ways to do this, but I'm going to use psql because it's pretty easy and fast.  If you have a better way, please feel free to clue me in.

First, boot up psql (I just type psql into the Windows Run dialog), then log in using your credentials:

Server [localhost]: example.com
Database [postgres]: example_database_name
Port [5432]:
Username [postgres]: example_username
psql (9.2.3, server 8.4.20)
WARNING: psql version 9.2, server version 8.4.
         Some psql features might not work.
WARNING: Console code page (437) differs from Windows code page (1252)
         8-bit characters might not work correctly. See psql reference
         page "Notes for Windows users" for details.
SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Type "help" for help.

example_database_name=>          
Hopefully, you know where your destination is and what tables exist inside that database.  If you need a quick refresher (like I always do), type \d for a quick listing of tables:

example_database_name=> \d

In this example, I'm going to copy out the zipcodes table and move it to another database. I do this by running the \COPY command and plunking the output into a .csv file in a destination of my choice.

example_database_name=> \COPY (SELECT * FROM zipcodes) TO 'C:\local\path\to\destination\zipcodes.csv' WITH CSV HEADER;
That took about ten seconds... not too bad for about 100,000 records.
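Since \COPY accepts an arbitrary query, you can also filter or trim columns on the way out. Something like this would grab a single state's worth of rows (the column names here are made up for illustration):

example_database_name=> \COPY (SELECT zip, city, state FROM zipcodes WHERE state = 'PA') TO 'C:\local\path\to\destination\zipcodes_pa.csv' WITH CSV HEADER;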

Now let's exit the example database

example_database_name=> \q

OK, now let's connect to a different database using psql. Fire up psql again with your other database's credentials (use the same methodology as the first code block above):

example_database_name2=> \COPY rebate_zipcodes FROM C:\Users\Dorothy\Desktop\zipcodes.csv WITH CSV HEADER;
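If you're paranoid like me, a quick row count on the destination table confirms everything made the trip:

example_database_name2=> SELECT COUNT(*) FROM rebate_zipcodes;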

Obviously, in both \COPY statements you can omit the WITH CSV HEADER clause.  However, I like to have it so I know what the hell the table looks like in case I need to investigate the .csv, which I often do.

Great! That actually took about a minute. But regardless, this is not such a bad way to handle cross-database table copying.

More info on copy and other psql commands:
http://www.postgresql.org/docs/9.2/static/sql-copy.html