[NBLUG/talk] Transferring data from Oracle to an open-source
sampln at sbcglobal.net
Wed May 24 16:25:06 PDT 2006
Eric Eisenhart wrote:
> On Tue, May 23, 2006 at 09:17:13PM -0700, Lincoln Peters wrote:
>> I have a project I've been working on for the database class at SSU.
>> It's starting to look like my project will be useful well beyond this
>> class, so I'd like to transfer it from the CS department's Oracle server
>> ("kirby") to my own server. I'd rather not have to deal with the
>> licensing issues (not to mention cost) of running my own Oracle server,
>> so I'd like to migrate it to an open-source SQL database such as MySQL
>> or PostgreSQL.
> What are you going to use the DB server for? Oracle will let you run a
> development-only, single-user version for free and has had that arrangment
> for a long time.
I know, but that license agreement doesn't fit my goals.
I'm building a wildlife database, that can be used to identify plants
and animals, or even to generate field guides (in print or digital). A
few volunteers at the Fairfield Osborn Preserve are planning to survey
the plants and animals that live there over the summer, and they have
expressed interest in using my database to store their findings.
> They've also got a newer "Express Edition" thing that's
> more limited in some ways, but has much less restrictive licensing
> (redistributable in commercial products!) and is a lot easier to install.
> (restrictions are things like 4GB of data, 1 CPU used, 1GB of RAM used)
I don't know how big the database will get. I doubt it would exceed
4GB, but it's just too soon to tell.
>> What I'd like to find out is:
>> * Which open-source database is preferred? As far as I can tell, MySQL
>> is the most popular, but I've heard good things about PostgreSQL as well.
> It depends.
> Basically, PostgreSQL has fuller support for the ANSI SQL standard. MySQL's
> been catching up, especially if you use the InnoDB table format that gets
> you transactions. If you're doing complicated SQL with sub-SELECTs and
> things like that, you're gonna want PostgreSQL. Another example: PostgreSQL
> uses sequences; MySQL uses a marker on a column that sets it as
> auto-increment. (you can emulate MySQL's behavior in PostgreSQL or Oracle
> with a sequence and a trigger). Heck, for that matter, PostgreSQL can do
> triggers and I don't think MySQL does them yet.
Most of the queries involved in my database are some combination of
joins and "IN" operators, such as:
--Display all traits of the organism called 'Pinus monticola'.
SELECT tname FROM Organism o, OrganismHasTrait ot, Trait t WHERE
o.oid=ot.oid AND ot.tid=t.tid AND o.sname='Pinus monticola';
--Display all organisms that have needles.
SELECT o.oid, o.sname FROM Organism o, OrganismHasTrait ot WHERE
o.oid=ot.oid AND ot.tid IN (SELECT tid FROM Organism WHERE tname='has
needles') AND ot.truefalse='T';
--Display all traits possessed by organisms that have needles, and how
many organisms possess each trait, sorted by number of organisms in
descending order (i.e. starting with the trait with the most associated
SELECT ot.oid, t.tname, count(*) Organisms FROM OrganismHasTrait ot,
Trait t WHERE t.tid=ot.tid AND ot.tid IN (SELECT tid FROM Organism WHERE
tname='has needles') AND ot.truefalse='T' GROUP BY ot.tid, t.tname ORDER
BY Organisms DESC;
Given what you said about PostgreSQL having better support for
sub-SELECT queries (which may play a major role in future versions
involving more complex data sets), it sounds like PostgreSQL might
actually be preferable to MySQL, at least in this regard (I'd rather not
have to completely rewrite these queries so that I don't have to use
> On the other hand, having dealt with both, MySQL is a lot easier to manage
> that PostgreSQL. PostgreSQL has some weird ideas about things with trying
> to match DB accounts to user accounts.
Are you saying that every DB account has to correspond to a user
account? Actually, I think I can live with that.
> Also, PgSQL has (had? I haven't kept up) issues with CLOBs and BLOBs, and
> the DB driver doesn't support placeholders, forcing DBI to emulate them
> without the full security benefits of placeholders.
I don't use CLOBs or BLOBs in this database, so this shouldn't be an
issue. I had been thinking about having it store images (photos of
organisms, illustrations of traits, etc.), but I think I could just as
easily store them outside the database and use a VARCHAR attribute to
store the path to an image.
When you say placeholders, are you referring to bind values? For example:
my $insert_sql = 'insert into foo (a,b,c) values (?,?,?);
I was aware that placeholders (bind values) could yield some major
performance gains over interpolated statements, but I hadn't considered
the security benefits (there is no data in here that needs to be secured
against unauthorized access, so the only security issues I'd be worried
about are unauthorized inserting and deleting of data).
>> * Based on my understanding of the Perl DBI (which I've studied but have
>> not yet used), it should be a simple matter to transfer all of the
>> existing records from the Oracle database into any other RDBMS supported
>> by the Perl DBI. Assuming that I use either MySQL or PostgreSQL, will
>> this be a simple task?
> Reasonably so, yes. Especially if you're willing to write the new table
> definitions by hand and you're only trying to script the transfer of data.
> Something like this:
Looks like what I had in mind. The table definitions are surprisingly
simple, so rewriting the definitions by hand isn't a big deal. In fact,
I ran across a few deficiencies in my original design (mostly 1-to-n
relationships that should have been n-to-m), and I don't think it would
be difficult to fix them during the conversion process.
> Or even easier if you find something that does the work for you. Say this:
I don't know if I'll end up using MySQL, but this looks like something
worth remembering for the future.
> (nothing quite so easy for Pg that I could find in under 30 seconds of
> googling. http://techdocs.postgresql.org/techdocs/oracle-to-pg-porting.php
> is the best I could do...)
As far as I can tell from this page, the features that differ between
Oracle and PostgreSQL are all features that I'm not using, so it
shouldn't be a problem. The Perl DBI trick should work, then.
<sampln at sbcglobal.net>
I just remembered something about a TOAD!
More information about the talk