Sunday, March 20, 2005

more study

In a relation, R, FD X->A holds if any two tuples with same attribute set as X have attribute A also. X->A and X->B can be combined to get X->AB.
key is uniquely identify all the other attributes in R and it's minimal.
ER key doesn't have to be minimal but for Relation key, it has to be.
ER key is property of entities while Relation key is property of tuples.
FD's come from keyness of ER, many one relationship or from common sense.
If X is key for E1 and Y is key for E2, X->Y holds when E1 and E2 have many one relationship from E1 to E2. Drinker's name gives you favorite Beer.
Update anomaly, and deletion anomaly can be avoided with normalization.
1NF = all domains are atomic values. A|1,2,3 becomes A|1, A|2, A|3
2NF = key->non-key. but partial-key !-> non-key.
AB is a key. AB -> C is ok. D->C is ok. But, A -> C is not ok.
3NF = key->non-key. non-key!->non-key.
AB->C is ok. D->C is not ok. A->C is not ok.
BCNF = key->non-key. key->partial-key. a key member attribute cannot depend on another key member attribute. superkey -> any-attribute.
AB->C is ok. AB->D is ok. But, A->B is not ok.
bottom-up: project relations. top-down: start with good ER model.
MVD = AB->C means (AB)1 C1, (AB)2 C2, (AB)1 C2, (AB)2 C1 are all in R.
4NF decomposition: X->->Y becomes XY and X union (R-Y)
select, project, cartesian product, union, set difference, set intersection, theta join, natural join, division.
Union, intersection, difference works on relations with same relation schemae.
selection: picks rows.
projection: picks columns.
products and joins: new relation results in.
renaming a relation or attributes is also possible.
SIGMAcondition(Relation) returns array of rows in the parameter Relation.
PIlistofattributes(Relation) returns a relation with those attributes only. eliminates duplicates. order of attributes preserved.
RelationXRelation returns a relation.
Theta join is same as SIGMAcondition(R1XR2).
Natural join considers same attribute name the same.
Unary(select project rename) > products joins > intersection > union difference.


Saturday, March 19, 2005

database midterm

let's study for the midterm:
Database Management System(DBMS) is not database.
Database is an integrated collection of large data shared by mass.
Database needs to be efficient in accessing, modifying, storing..etc large amount of data.
Database supports concurrent access to the data it stores.
Features of database: Atomicity - like neurons...transactions are binary, completely done or doesn't even start.
Isolation - for user's point of view, transactions appear one at a time.
Durability - recover from failure.
DBMS, or database system, is a software package that has all the features above plus organized frontend (query language...etc), query processor, transaction manager, and storage.
Database Vs. Simple file system: data is not coordinated, scatered. integrity is lost.
requirement analysis (results in DB requirements) -> conceptual design (conceptual schema is made) -> logical design (logical schema constructed) -> physical design (internal schema created)
conceptual schema is about ER model. logical schema can be implemented on a DBMS and it is about relational model.
Relation: row is tuple, column is attribute, key can identify a row in a relation.
An entity takes up one relation. Entity key is the relation key. each instance is a tuple.
each relationship is a relation with keys from connected entities.
Data Definition Language, Data Manipulation Language.
A relationship(diamond) has at least two lines coming out of it.
Unary relationship is connected to a one entity set with different roles.
Arrowed entity in multiway relationship is determined by all the entities connected to the relationship. Converting from multiway to binary relationship uses new entity set that collects all the entities connected to multiway relationship.
In conceptual design phase, we're converting DB requirements (english) to ER model. not algorithmic.
Entity has at least two attributes, or you can convert the entity to an attribute. ISA = subclass notation. no multiple inheritance. subclass's attributes are inherited to superclass entity. a subclass of multiple entities can be a problem. need ad hoc device to resolve conflicts. root entity has a key which should be the key for subclasses. ex, beer name is also key for ales. Weak entity set is connected to a normal entity via many one (actually, many to exactly one) relationship. weak entity doesn't have a key sometimes. Entity is more than a name of a thing and has at least one non key attribute and it is many in many one relationship. if you can create a unique id, you don't need weak entities. if not, it's unavoidable.
multiplicity : many-many...etc
referential integrity: when connection of entities is made, referenced entity should exist.
logical design phase converts ER model to Relational model.
relations are normalized for ease of implementation and quality of database.
relation has a name and schema(definition, heading of a relation). key of a relation is the minimal set of column(attribute)s that identifies each tuple(row)s.
cartesian product, relation-subset of cartesian product?
arity: number of domains. cardinality: number of tuples <= product of cardinality of each domain.
database is set of relations. database schema is set of all relation schemae in the database.
intention vs. extension : database schema vs. database
sql is built upon relations cause it's simple and similar to our thinking but it uses bags instead of sets.
Entity set -> relation directly with all attributes included.
Relationship -> relation with only key attributes of connected entities and attributes of itself.
key is minimal. primary key is the selected key for a relation.
foreign key is borrowed key from another entity.
one-one relationship relation can take a key from either one of the entities.
renaming of key is essential in some cases.
if many-one relationship is present, you can combine the two entities.
Drinkers(name,addr) + Favorite(drinker-name,beer-name) = Drinkers1(name,addr,favBeer). but redundancy can come easily. do not combine many-many.
Relation for weak entity set is created with borrowed keys from parents.
weak relationship is absolved into weak entity set relation.
subclasses can be converted using three different approaches.
object oriented: create relations for each class(entity). subclass gets all attributes from mother class. subclass entity instance does not appear on the mother class's relation
ER style: create relation for each class. but subclass does not inherit attributes from mother class. subclass instance appears in mother class relation.
nulls: create one relation and put null for mother in attributes of subclass.
using nulls saves space unless many attributes are null.
multi-valued attribute: hobby can be list of hobbies. usually gets separate relation.
composite attribute: date is composite of day, month, year. can be stored in separate attributes like date as day and month...if you access one atomic attribute of composite, then separate that out.
derived attribute: age can be derived from dob. not stored separately but calculated on the fly.
ternary relationship relation has combined keys from 3 entities.
isa relationship relation can be made with only subclasses or with mother class also. if you do not create a relation for mother, then you have to inherit non key attributes from mother.

midterms

they are not xterm, aterm, eterm...etc. no ordinary term.
they are midterms.
why
did
they
make
such
things
i have to study to feel confident during the exam