Homework 3: SQL

October 2003

Due: Sunday, November 2, 2003 (7pm)

To be done individually!!

1. Introduction

In this assignment, you will write SQL queries that answer questions about a database containing information about players, teams, and games from the National Basketball Association (NBA). Yahoo Sports (http://sports.yahoo.com/nba/) and ESPN (http://sports.espn.go.com/nba/) are examples of web sites that use such a database. We have simplified the schema by keeping information for last season only[1].

Some notes to keep in mind as you go:

We have tried to limit the number of correct solutions to each question, but a question may still have several correct solutions. Each question has a relatively short solution.
Unless specified in the question, your queries should produce the correct answer for any instance of the relations. We have provided a script containing the DDL for creating a sample database with some test data, but your queries should work over any legitimate instance of the relations.
Unless specified in the question, return the entire tuple. For example, “all games” means return the entire Game tuple.
Unless specified in the question, do not remove duplicates.
Return the fields in the order we specify.
Note that the dataset provided is minimal. We will test your queries on a much larger dataset, so add your own rows to the one provided to check that your queries are in fact correct.
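
As a rough illustration of the last note, extra test rows can be added with plain INSERT statements once the database is loaded. The sketch below assumes the column names given in the Schema section; the team, player, and all values are invented for illustration and are not part of the provided data.

```sql
-- Hypothetical test rows (not part of the provided data set).
-- The team is inserted first because Player.team is a foreign key to Team.
INSERT INTO Team (name, city)
VALUES ('Testers', 'Berkeley');

INSERT INTO Player (playerID, name, position, height, weight, team)
VALUES (9001, 'Jo Example', 'Guard', 75, 190, 'Testers');
```

Rows like these are most useful when they target an edge case, such as two teams in the same city or a player who appears in no game.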

From your home directory, run the command:

gtar -zxvf /home/cc/cs186/fa03/Hw3/Hw3.tar.gz

This will create a directory Hw3/ in your home directory with the following scripts:

Script	Description
initdb.sh	Initialize database directory at $PGDATA_HW3. To reinitialize, you have to delete existing $PGDATA_HW3 first.
startpg.sh	Starts Postgres master process for database directory at $PGDATA_HW3
loadnba.sh, loadbignba.sh	Creates the database ‘hw3’ and loads some test data. The test data from loadbignba.sh is based on real data from this year’s NBA roster; loadnba.sh loads a smaller data set. Feel free to insert your own data for testing. Read loaddata.sql to see how to use the copy command to bulk load data.
startpsql.sh	Starts psql for hw3
runquery.sh <QUERY_FILE>	Runs the query stored in <QUERY_FILE> using psql and prints the output to the screen. We have provided a sample query file, query0.sql.
stoppg.sh	Stops the Postgres master process for the database directory at $PGDATA_HW3. Run this before you log off.
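
For orientation, a bulk load with the copy command looks roughly like the sketch below. The file path is a placeholder, and the exact form used for this assignment is the one in loaddata.sql, which may differ (for example, in its delimiter).

```sql
-- Sketch only: the path is hypothetical; see loaddata.sql for the real commands.
-- COPY reads the file on the server side, so the path must be absolute and
-- readable by the Postgres process. By default, columns are separated by tabs.
COPY Team FROM '/absolute/path/to/team.txt';
```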

To set up the initial database for the very first time, you should run:

initdb.sh, followed by
startpg.sh and
loadnba.sh.

To start up psql, type:

psql hw3

Some of these scripts are there simply for convenience, to ensure that you run our versions of pg_ctl, psql, initdb, createdb, etc., and not the versions you compiled for a previous homework. You can choose not to use our scripts (at your own risk!) and create your own database (using our schema.sql), as you may have done in hw0.

2. Schema

There are 4 relations in the schema, which are described below along with their integrity constraints. Columns in the primary key are underlined.

Player (playerID: integer, name: varchar(50), position: varchar(10), height: integer, weight: integer, team: varchar(30))

Each Player is assigned a unique playerID. The position of a player is one of Guard, Center, or Forward. The height of a player is in inches, and the weight is in pounds. Each player plays for only one team. The team field is a foreign key to Team.

Team (name: varchar(30), city: varchar(20))

Each Team has a unique name associated with it. There can be multiple teams from the same city.

Game (gameID: integer, homeTeam: varchar(30), awayTeam: varchar(30), homeScore: integer, awayScore: integer)

Each Game has a unique gameID. The fields homeTeam and awayTeam are foreign keys to Team. Two teams may play each other multiple times each season. There is an integrity check to ensure homeTeam and awayTeam are different.

GameStats (playerID: integer, gameID: integer, points: integer, assists: integer, rebounds: integer)

GameStats records the performance statistics of a player within a game. A player may not play in every game; if a player does not play in a game, no statistics are recorded for that player in that game. gameID is a foreign key to Game. playerID is a foreign key to Player. Assume that two assertions are in place. The first ensures that the player belongs to either the home team or the away team of the game, and the second ensures that the total score of a team (in Game) is consistent with the sum of the points (in GameStats) of that team's players in the game[2].
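
To make the constraints above concrete, the schema could be written in DDL roughly as follows. This is a sketch, not the provided schema.sql, which remains authoritative. It assumes that the keys are the attributes described as unique, that GameStats is keyed on (playerID, gameID), and that the position rule is enforced with a CHECK; none of these details is confirmed by the text above.

```sql
-- Sketch of the four relations; the provided schema.sql may differ in detail.
CREATE TABLE Team (
    name VARCHAR(30) PRIMARY KEY,
    city VARCHAR(20)
);

CREATE TABLE Player (
    playerID INTEGER PRIMARY KEY,
    name     VARCHAR(50),
    -- Assumption: the three allowed positions are enforced by a CHECK.
    position VARCHAR(10) CHECK (position IN ('Guard', 'Center', 'Forward')),
    height   INTEGER,                           -- inches
    weight   INTEGER,                           -- pounds
    team     VARCHAR(30) REFERENCES Team (name)
);

CREATE TABLE Game (
    gameID    INTEGER PRIMARY KEY,
    homeTeam  VARCHAR(30) REFERENCES Team (name),
    awayTeam  VARCHAR(30) REFERENCES Team (name),
    homeScore INTEGER,
    awayScore INTEGER,
    CHECK (homeTeam <> awayTeam)                -- a team cannot play itself
);

CREATE TABLE GameStats (
    playerID INTEGER REFERENCES Player (playerID),
    gameID   INTEGER REFERENCES Game (gameID),
    points   INTEGER,
    assists  INTEGER,
    rebounds INTEGER,
    -- Assumption: at most one statistics row per player per game.
    PRIMARY KEY (playerID, gameID)
);
```

The two assertions on GameStats span several relations, so they are not expressed in this sketch; your queries may simply assume that they hold.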

3. Queries

Part I

1. Find the distinct names of players who play the “Guard” position and whose name contains “Jo”. (ORDER BY Player.name)

select list: Player.name

ordering: Player.name ascending

2. List the cities that have more than one team playing there. (ORDER BY Team.city)

select list: Team.city

ordering: Team.city ascending

3. Find the player(s) with the highest total score for this season (the highest total over all games of this season). Output each such player’s ID, name, team, and total score.

select list: Player.playerID, Player.name, Player.team, total_score