CS50 — Week 4, part 2

Pointers

  • alternative way to pass data between functions
    • up to this pt, we’ve passed data by value (aka only passing a copy of that data)
  • with pointers, we can pass the actual variable itself into a function
    • so a change made in one function can impact what happens in a different function
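
A minimal sketch of the difference, written by me for illustration (swap_by_value and swap_by_pointer are made-up names, not from the lecture). The first function only ever sees copies; the second gets addresses, so it can change the caller's variables.

```c
// swap_by_value and swap_by_pointer are hypothetical names for illustration.

// Receives copies of a and b, so the caller's variables are untouched.
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Receives the addresses of the caller's variables, so the swap "sticks".
void swap_by_pointer(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Calling swap_by_value(x, y) leaves x and y alone, while swap_by_pointer(&x, &y) exchanges them.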

computer memory

  • every file on your comp lives on your disk drive (hard disk drive HDD, solid-state drive SSD, etc.)
    • disk drives = storage space; we can’t directly work (manipulate, use data) there
    • so we have to move data to RAM to work with it (the memory I’ll be dealing with as a programmer)
    • every time the computer is turned off, RAM is wiped (thus the need to store data in disk drives)
  • memory = huge array of 8-bit wide bytes (512 MB, 1 GB, 2 GB, 4 GB…)
  • every int I create requires 4 bytes, a char requires 1 byte, a double or long long 8 bytes… etc.
  • so, memory is a big array, and arrays can store info and allow for so-called random access
    • aka it’s unnecessary to iterate thru every element until we come to the one we want – we can just indicate the index location (addresses are big numbers, so they’re usually written in hexadecimal rather than binary notation)
  • every location in memory has an address, and a pointer is nothing more than an address!!!

back to pointers

  • pointers are addresses of the locations in memory where variables are stored
  • a pointer’s value = a memory address
  • a pointer’s type = the kind of data stored at that memory address
  • simplest pointer in C = the NULL pointer (which points to nothing)
  • when creating a pointer and not immediately setting its value (to some meaningful address), always set the value of the pointer to NULL
    • check using the equality operator if (pointer == NULL)
  • easy way to create a pointer = “extract” the address of an already existing variable with the address extraction operator &
    • ex. if x is an int-type variable, then &x is a pointer-to-int whose value is the address of x
    • if arr is an array of doubles, then &arr[i] is a pointer-to-double whose value is the address of the i’th element of arr
      • an array’s name = in most expressions, treated as a pointer to its 1st element!
  • why do we care about where some variable lives in memory?
  • with the dereference operator *, we can look at and modify the data at the location
    • ex: if we create a pointer-to-char called pc, then *pc is the data that lives at the memory address stored inside the pointer variable pc
      • if we do *pc = 'D', we are assigning a new value to the data at the address stored in pc
  • * in this context “goes to the reference” and accesses the data at that memory location (like how knowing a neighbor’s address isn’t enough to allow you to hang out with them – you have to go to their house first)
  • what happens if we try to dereference a pointer whose value is NULL? –> usually a segmentation fault! (can’t go to “nothing”!)
    • this is good behavior, b/c it defends against accidental dangerous manipulation of unknown pointers
    • better for your program to crash than screw up another function or another program (possibly outside of yours)
    • segmentation faults = segfault = when we try to touch memory we shouldn’t, aka a program trying to read or write an illegal memory location
  • int* p;
    • value of p is an address
    • dereferencing p (*p) will give the value at address p, which will be of type int
  • int x, y, z;
    • statement declaring 3 integer variables
  • to declare multiple pointers of the same type on the same line:
    • int* pa, *pb, *pc;  …each needs its own star!
    • pointers are a syntactic messy mess and it’s easy to make mistakes with them
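
To tie the & and * operators and the NULL check together, here's a small sketch of my own (set_through_pointer is a hypothetical helper, not something from the course). It only dereferences the pointer after confirming it isn't NULL.

```c
#include <stddef.h>  // NULL

// Hypothetical helper: writes value into the int that p points to,
// but refuses to dereference a NULL pointer.
// Returns 1 on success, 0 if p was NULL.
int set_through_pointer(int *p, int value)
{
    if (p == NULL)
    {
        return 0;   // nothing to "go to"
    }
    *p = value;     // dereference: change the data at that address
    return 1;
}
```

With int x = 5; and int *px = &x;, calling set_through_pointer(px, 42) changes x itself, because px holds x's address.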

twinkle twinkle little star

  • type name (as in, char*, a pointer to the 1st char in a “string”)
  • dereference operator (“go to” that address to get at that data)

64-bit machine vs. 32-bit

  • in a 32-bit machine, every memory address is 32-bits long, or 4 bytes
  • in a 64-bit machine, addresses in memory are 8 bytes
  • since pointers are just addresses, they too will either be (no matter the data type) 4 or 8 bytes in size

Dynamic Memory Allocation

  • rather than having to know exactly how much memory we’ll need at compile time (b/c of various user input over long periods of time, for example), we can get access to dynamically allocated memory at run time (aka, as our program runs)
  • dynamically allocated memory comes from a pool of memory called the heap
  • up to now we’ve been working with named variables we create that come from pool of memory called the stack
  • if the heap and the stack run into each other, the program essentially runs out of memory!
  • standard library function malloc()
    • parameter = number of bytes requested
    • if it can obtain the requested memory for you, malloc() will return a pointer to that memory
    • if it can’t, it’ll hand back NULL
  • statically obtain an integer:
    • int x;
  • dynamically obtain an integer:
    • int *px = malloc(sizeof(int));
    • 4 bytes of memory from the heap, pointer called px, which we can dereference to access the data there
  • when using dynamically allocated memory, MUST remember to “hand back” that memory explicitly b/c it’s not automatically returned to the system for later use when the function in which it’s created finishes execution like static memory
    • failure to do so results in running out of memory or in memory leaks which can compromise performance
    • use free() that accepts pointer to some memory (but only do it one time or else weird things might happen)

char* word = malloc(50 * sizeof(char));
// do stuff with word
// now we're done working with that block of memory
free(word);
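
A slightly fuller sketch of the same malloc()/free() pattern, this time checking for NULL. make_squares is a name I made up for the example, and the caller is assumed to be responsible for calling free() exactly once.

```c
#include <stdlib.h>  // malloc, free, size_t

// Hypothetical helper: returns a heap-allocated array holding
// 0*0, 1*1, ..., (n-1)*(n-1), or NULL if malloc() could not get the memory.
// The caller must free() the result when done with it.
int *make_squares(size_t n)
{
    int *squares = malloc(n * sizeof(int));
    if (squares == NULL)
    {
        return NULL;
    }
    for (size_t i = 0; i < n; i++)
    {
        squares[i] = (int) (i * i);
    }
    return squares;
}
```

Because the array lives on the heap, it survives after make_squares() returns, which is exactly why it has to be handed back explicitly with free().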

Structures

  • arrays have allowed us to unify a bunch of elements of same type
  • structures allow us to bring together variables of different types
    • all put together into a new variable type w/ its own name
  • a “super-variable”!

struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
};

  • usually define structure in separate header (.h) files or atop our programs outside of any functions
  • access the fields of the structure with the dot operator.

// variable declaration
struct car mycar;

// field accessing
mycar.year = 2011;
strcpy(mycar.plate, "CS50");

mycar.odometer = 50505;

…in which the data type is struct car and the variable name is mycar

  • structures, like all variables, don’t need to be created on the stack, but can use dynamically allocated memory at run time
  • if we do this tho, 1st need to dereference pointer to the structure (aka “go to” the memory address storing the structure) and then access its fields

// variable declaration
struct car *mycar = malloc(sizeof(struct car));

// field accessing
(*mycar).year = 2011;
strcpy((*mycar).plate, "CS50");

(*mycar).odometer = 50505;

  • but C programmers like shorter ways to do stuff than having to use the * operator along with the . operator, so there’s this: -> (dereferences structure and accesses field all at once):

// field accessing
mycar->year = 2011;
strcpy(mycar->plate, "CS50");

mycar->odometer = 50505;
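
Putting the pieces together, here's a rough sketch (new_car is a hypothetical helper of mine) that builds a struct car on the heap and fills it in with ->. I've used strncpy instead of strcpy so that a plate longer than the field can't spill over; that's my own choice, not something from the lecture.

```c
#include <stdlib.h>  // malloc, free
#include <string.h>  // strncpy, strcmp

struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
};

// Hypothetical helper: allocates a car on the heap and fills in its fields.
// Returns NULL if malloc() fails. The caller must free() the result.
struct car *new_car(int year, const char *plate, int odometer)
{
    struct car *c = malloc(sizeof(struct car));
    if (c == NULL)
    {
        return NULL;
    }
    c->year = year;
    c->model[0] = '\0';                              // no model given: empty string
    strncpy(c->plate, plate, sizeof(c->plate) - 1);  // copy at most 6 chars
    c->plate[sizeof(c->plate) - 1] = '\0';           // make sure it is terminated
    c->odometer = odometer;
    return c;
}
```

After struct car *c = new_car(2011, "CS50", 50505);, both c->year and (*c).year read the same field.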

Defining Custom Types

  • way to write shorthand or rewritten names for data types w/ C keyword typedef
    • typedef <old name> <new name>;
  • top of .c file or in separate .h file
    • ex: in cs50.h, this line of code: typedef char* string; because at first it was easier to just think of them as strings and not worry about the char* bit
  • can define the structure in the middle of my typedef!

typedef struct car
{
    int year;
    char model[10];
    char plate[7];
    int odometer;
}
car_t;

  • now, when declaring my variable of this new struct type, I don’t have to write struct car mycar; but instead can use whatever new name I’ve assigned, like car_t mycar;

Recursion

  • a function that calls itself
  • an “elegant” solution because of how short they can make functions and how they can help avoid long loops or calling additional functions

an example from math class

  • factorial function n! is defined over all positive integers
  • n! = the product of all positive integers less than or equal to n
  • let’s define the mathematical function in programming terms as fact(n)

fact(1) = 1
fact(2) = 2 * 1
fact(3) = 3 * 2 * 1
fact(4) = 4 * 3 * 2 * 1
fact(5) = 5 * 4 * 3 * 2 * 1
...

which can be reduced to

fact(1) = 1
fact(2) = 2 * fact(1)
fact(3) = 3 * fact(2)
fact(4) = 4 * fact(3)
fact(5) = 5 * fact(4)
...

which is the same as saying

fact(n) = n * fact(n - 1)

  • for any recursive definition,
    • a base case – simple solution to problem that stops the recursion from continuing
    • the recursive case – passes work to different call down the line

int fact(int n)
{
   // base case
   // recursive case
}

int fact(int n)
{
   if (n == 1)
      return 1;
   else
      return n * fact(n - 1);
}

hint #1: you don’t need curly braces following a conditional statement (like if or else) when there’s only one line of code inside!

hint #2: we’re returning 1 on the base case b/c the factorial of 1 = 1 😉

  • also possible to have more than one base or recursive case if the program might recurse or terminate in different ways depending on input

example of multiple base cases

  • the Fibonacci number sequence:
    • 1st element is 0
    • 2nd element is 1
    • nth element is sum of (n – 1)th and (n – 2)th elements

int fibo(int n)
{
   if (n == 1)
      return 0;
   else if (n == 2)
      return 1;
   else
      return fibo(n - 1) + fibo(n - 2);
}

example of multiple recursive cases

  • the “Collatz conjecture” – applies to positive integers and speculates it’s always possible to get “back to 1” by following steps:
    • if n is 1, stop
    • otherwise, if n is even, repeat this process on n/2
    • otherwise, if n is odd, repeat this process on 3n + 1
  • recursive function collatz(n) that calculates # of steps to get to 1 when starting from n:

 

int collatz(int n)
{
   // base case
   if (n == 1)
      return 0;
   // even numbers
   else if (n % 2 == 0)
      return 1 + collatz(n/2);
   // odd numbers
   else
      return 1 + collatz(3*n + 1);
}

The Call Stack

  • when a function is called, the system sets aside space in memory for that function to do its work
    • these chunks of memory are frequently called stack frames or function frames
  • more than one function’s stack frame may exist in memory at a given time
    • ex: if main() calls move(), which then calls direction(), all 3 functions have “open frames” – but only the most recently called function has an “active” frame
  • when a new function is called, a new frame is pushed onto the stack and becomes the active frame
  • when a function finishes, its frame is popped off the stack, and the frame immediately below becomes the new active frame on the top of the stack, picking up where it left off in its executable steps
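
One way to watch the frames pile up is to have a recursive function report how deep it got. This is my own sketch (fact_traced and its extra parameters are hypothetical, added just for the demonstration): each call is a new frame, so the deepest call tells us how many frames were open at once.

```c
// Hypothetical variant of fact() that also records how deep the call stack got.
// depth is the depth of the current frame (pass 1 for the first call);
// *max_depth is updated to the deepest frame reached.
int fact_traced(int n, int depth, int *max_depth)
{
    if (depth > *max_depth)
    {
        *max_depth = depth;   // a new frame was pushed: remember the deepest point
    }
    if (n == 1)
    {
        return 1;             // base case: from here the frames start popping off
    }
    return n * fact_traced(n - 1, depth + 1, max_depth);
}
```

Starting from fact_traced(5, 1, &deepest), there are five frames open by the time the base case is hit, and they finish in reverse order.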

File Pointers

  • the data structure FILE in C represents any files you can click on in familiar GUI world
  • the ability to read data from and write data to files = primary way of storing persistent data (aka data that does not disappear when my program stops running)
    • if something went wrong, you can review what happened, for example
  • almost always when working with files, we’ll be using pointers to them in FILE* form
  • file manipulation functions all live in stdio.h (along with printf())
  • fopen()
    • opens a file and gives file pointer to it
      • FILE* ptr = fopen(<filename>, <operation>);
      • FILE* ptr1 = fopen("file1.txt", "r");
      • in this example, I’ve created a pointer-to-file called “ptr1”, and what it’s pointing to is a new text file I’ve just opened called “file1.txt” and the operation is read so we can read information from the file
      • "w" for write, which will write information to the file (or overwrite if info was already written there)
      • "a" for append, which will write new information to the end of the file
      • only one operation with one file pointer!
      • if we want to do 2 different operations, we have to create 2 separate pointers that point to the same file!
    • always check return value for NULL (even tho vast majority of time we’ll have gotten a legitimate pointer back)!!
    • fclose()
      • closes file, pass in name of the file pointer
      • fclose(ptr1);
    • fgetc()
      • reads and returns next character from file pointed to (if it’s the 1st time fgetc() is being called for a file, it’ll be the 1st character in the file)
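      • when opening the file, the operation passed in as a parameter must be "r" for read, or there’ll be an error!
      • int ch = fgetc(ptr1); (stored in an int rather than a char, b/c fgetc() returns an int so that the end-of-file value EOF can be told apart from a real character)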
    • fputc()
      • writes or appends character to pointed-to file
        • fputc(<character>, <file pointer>);
      • again, operation of the file pointer must be "w" for write or "a" for append, or there’ll be an error
    • fread()
      • since getting characters one at a time with fgetc() isn’t very efficient, fread() allows us to get an arbitrary amount of info from a file
      • reads <qty> units of size <size> from the file pointed to and stores them in memory in a “buffer” (usually array) pointed to by <buffer>
        • fread(<buffer>, <size>, <qty>, <file pointer>);
        • int arr[10]; fread(arr, sizeof(int), 10, ptr);
        • double* arr2 = malloc(sizeof(double) * 80);
          fread(arr2, sizeof(double), 80, ptr);
        • char c; fread(&c, sizeof(char), 1, ptr);
      • operation of file pointer must be "r" for read!
      • the 1st parameter is an address! so while we can just use the name of an array (because that is a pointer), we have to use the address extraction operator if we’re just dealing with a regular variable like a char
    • fwrite()
      • writes <qty> units of size <size> to the file pointed to by reading them from a buffer (usu. array) pointed to by <buffer>
        • int arr[10]; fwrite(arr, sizeof(int), 10, ptr);
        • in this case, there’d be some numbers stored in the array arr that you wanted to write to your file that’s pointed to by ptr
      • operation of file pointer must be "w" or "a"
  • and there’s more:

[screenshot: more file functions]
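
Here's a small sketch of my own that strings a few of these calls together (write_text and count_chars are hypothetical helper names). It opens a file for writing, puts characters into it with fputc(), then reopens it for reading and walks it with fgetc(), checking fopen()'s return value for NULL both times.

```c
#include <stdio.h>  // FILE, fopen, fclose, fgetc, fputc, EOF

// Hypothetical helper: writes text to the file at path, one char at a time.
// Returns 0 on success, 1 if the file could not be opened.
int write_text(const char *path, const char *text)
{
    FILE *out = fopen(path, "w");
    if (out == NULL)
    {
        return 1;
    }
    for (const char *p = text; *p != '\0'; p++)
    {
        fputc(*p, out);
    }
    fclose(out);
    return 0;
}

// Hypothetical helper: counts the characters in the file at path.
// Returns -1 if the file could not be opened.
long count_chars(const char *path)
{
    FILE *in = fopen(path, "r");
    if (in == NULL)
    {
        return -1;
    }
    long count = 0;
    int ch;                           // int, not char, so EOF can be detected
    while ((ch = fgetc(in)) != EOF)
    {
        count++;
    }
    fclose(in);
    return count;
}
```

Note the two separate file pointers: one opened with "w" and one with "r", matching the one-operation-per-pointer rule above.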


Today’s Resources:

https://stackoverflow.com/questions/5484624/how-to-understand-the-pointer-star-in-c

https://courses.edx.org/courses/course-v1:HarvardX+CS50+X/courseware/

CS50 — Week 4

Strings Are a Lie!

  • what we’ve been treating as type “string” in CS50’s IDE is really just the memory address of the first char in any given string
    • so how does the computer find the “string” I want if I, say, want to printf it?
    • computer goes to the memory address of first char and reads chars in all subsequent memory addresses until coming to char \0 which denotes end
  • essentially a string is really just a pointer
  • string = synonym for data type char *
    • there is no string!!
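
To see the "read chars until \0" idea in action, here's a sketch of my own (my_strlen is a made-up name; the real library function is strlen). All it gets is the address of the first char, and it walks forward until it meets the terminator.

```c
#include <stddef.h>  // size_t

// Hypothetical re-implementation of the idea behind strlen():
// start at the first char and count until the '\0' terminator.
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
    {
        n++;
    }
    return n;
}
```

printf with %s works on the same principle: it is handed one address and keeps printing chars until it reaches \0.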

The Star Operator

  • the “go there” operator (also called the “dereference” operator)
  • the address of whatever type comes before – aka a pointer to that data
  • i.e. the way we type out “the address of a char” = char *

Pointers

  • pointers are just addresses where some data is stored!
  • when using pointers, be careful:
    • give the pointer a pointee (they are two separate things, so a pointee is not set up automatically) x = malloc(sizeof(int));
    • dereference a pointer to access its pointee *x = 42;
    • assignment (=) between pointers gives them access to the same pointee (called “sharing” a pointee) y = x;
  • my own question: in these examples, then, are x and y the pointers and *x the pointee?
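
A sketch of the three rules in one place (share_pointee is a hypothetical function I wrote for this). Here x and y are the pointers, and the single int that malloc() hands back is the pointee, which is what *x and *y both refer to.

```c
#include <stdlib.h>  // malloc, free

// Hypothetical demo of the three pointer rules.
// Returns the final value of *x, or -1 if malloc() fails.
int share_pointee(void)
{
    int *x = malloc(sizeof(int));  // give the pointer x a pointee
    if (x == NULL)
    {
        return -1;
    }
    int *y;

    *x = 42;           // dereference x to store 42 in its pointee
    y = x;             // pointer assignment: y now shares x's pointee
    *y = 13;           // changing the pointee through y...
    int result = *x;   // ...is visible through x as well

    free(x);           // one pointee, so it is freed exactly once
    return result;
}
```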

 Pointer Arithmetic

  • math with pointers!
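
A minimal sketch of what that math looks like (sum_with_pointers is a name I invented). Adding 1 to an int pointer moves it forward by one whole int, not by one byte, so arr[i] and *(arr + i) mean the same thing.

```c
#include <stddef.h>  // size_t

// Hypothetical helper: adds up n ints by stepping a pointer through the array.
int sum_with_pointers(const int *arr, size_t n)
{
    int total = 0;
    // arr + n is the address just past the last element
    for (const int *p = arr; p < arr + n; p++)
    {
        total += *p;   // go to the address p currently holds
    }
    return total;
}
```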

Memory Leak

  • if I start dynamically allocating memory in my program, I need to worry about freeing up that allocated memory once I’m done using it (or else my program may grow much more costly in terms of memory than it needs be)
  • asking for memory and then never giving it back – memory leaks – can cause a Mac or PC to slow
  • we can use valgrind ./<program name> to show us memory leaks (even in programs that otherwise seem fine, aka are able to be made)

The Heap and The Stack

  • the “stack” refers to functions that are called
    • on the bottom is main()
    • stacked on top are any functions called within main(), functions called within those functions, etc. etc.
    • once a function is finished (aka has gone thru all its steps/returned some value), the computer can “forget” about it and it’s taken off the stack, so to speak
  • the “heap” refers to memory allocations that are made
    • the top would be the first malloc() and from there down would be any other malloc()
  • stack overflow
    • when the stack grows up too far
    • a comp can be hacked and damaged if a hacker can take control using a memory mistake I’ve made!
  • heap overflow
    • when the heap grows down too far
  • both are examples of what’s called buffer overflow, where you are trying to use more memory than you should
  • to prevent from hacking attacks & server compromise (at least in one way), the programmer needs to make sure that when using a chunk of memory, she checks the bounds so that no one can go in and overwrite parts of code with their own…
    • ex: take a command line argument to be put into some char * of size 10 but then don’t check (maybe with an if() statement, for example) that a user doesn’t input more than 10 chars
    • the user could then input enough that (heap’s going down, remember) the return values (little breadcrumbs comp has given itself as reminders of where to return to after current function is finished) are overwritten with the user’s own data, thereby never allowing the comp to return to the place in the program but rather possibly sending it to a user’s own malicious program, etc.
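
The bounds check described above could be sketched like this. copy_if_fits is a hypothetical helper of mine, and it is only one simple way to guard a fixed-size buffer, not the way the lecture did it.

```c
#include <string.h>  // strlen, strcpy, strcmp

// Hypothetical helper: copies src into dst only if it fits,
// counting the '\0' terminator.
// Returns 1 on success, 0 if src is too long for the buffer.
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    if (strlen(src) >= dst_size)
    {
        return 0;      // would write past the end of dst: refuse
    }
    strcpy(dst, src);  // safe now, the length was checked first
    return 1;
}
```

With char buf[10];, a 9-char input fits (9 chars plus \0), but a 10-char input is rejected instead of overwriting whatever sits next to buf in memory.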

Images!

Bitmap – BMP

  • literally a map of bits with corresponding color values
[image: a very basic smiley face bitmap, where 1s are white and 0s are black]

JPEG

  • what Facebook, cameras use
  • stores potentially millions of colors
  • allows for compression – throw away some of the 0s and 1s in order to decrease the file size for easier storage, emailing, etc.
  • all JPEG files start with the same 3 bytes: 255 216 255

Hexadecimals

  • when speaking of images & colors, decimal and binary give way to hexadecimals
  • 16 digits, from 0 – 15, borrowing alphabet letters after 9
    • 0 1 2 3 4 5 6 7 8 9 a b c d e f
  • 0x is simply a convention to signal that some hexadecimals are thereafter following
    • 0xff = 255 in decimal, 11111111 in binary
  • so the JPEG starting bytes in hexadecimal are: 0xff 0xd8 0xff
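
As a quick sketch of how a program might use those starting bytes (starts_like_jpeg is a made-up name, and checking just 3 bytes is only a first hint, not proof that a file really is a JPEG):

```c
#include <stddef.h>  // size_t

// Hypothetical helper: returns 1 if the first three bytes are 0xff 0xd8 0xff,
// 0 otherwise (including when there are fewer than three bytes).
int starts_like_jpeg(const unsigned char *bytes, size_t n)
{
    return n >= 3
        && bytes[0] == 0xff
        && bytes[1] == 0xd8
        && bytes[2] == 0xff;
}
```

Writing the bytes as 255, 216, 255 in decimal would behave identically; 0xff and 255 are the same value, just spelled differently.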

What is a File?

  • a bunch of bits, stored somewhere (on a laptop or desktop or phone, etc.)
  • what does it mean to be a Microsoft Word file vs. an Excel file vs. a JPEG vs. mp3…
    • first bits follow some pattern, demarcating that file type

struct

  • in C, allows you to create some structure inside of which you store information

.csv

  • “comma separated values” files
  • simple Microsoft Excel, Google Spreadsheets, Apple Numbers can export these
    • just series of values separated by commas

 


Today’s Resources:

https://video.cs50.net/2016/fall/lectures/4?

CS50 – Week “3”

Of course, it hasn’t really been only 3 weeks since I began this course. But continue I will.

Linear Search

  • find sol. left to right or right to left

for each element in array

   if element you're looking for

       return true

return false // very last step in program, if nothing else returns a value
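
The pseudocode above translates fairly directly into C. This is my own sketch for an array of ints (the function name and parameters are my choices):

```c
#include <stdbool.h>  // bool, true, false
#include <stddef.h>   // size_t

// A sketch of linear search: check each element, left to right.
bool linear_search(const int *arr, size_t n, int target)
{
    for (size_t i = 0; i < n; i++)
    {
        if (arr[i] == target)
        {
            return true;
        }
    }
    return false;  // got thru the whole array without finding it
}
```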

Binary Search

look at middle of array

if element you're looking for

    return true

else if element is to left

   search left half of array // a "recursive call" - to call itself, to use its definition again and again

else if element is to right

    search right half of array 

else

    return false
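
Here's a rough recursive version in C, assuming a sorted array of ints. The lo/hi parameters are my own way of saying "which part of the array is still in play"; hi is one past the last index being searched.

```c
#include <stdbool.h>  // bool, true, false
#include <stddef.h>   // size_t

// A sketch of recursive binary search over the sorted range
// arr[lo] .. arr[hi - 1].
bool binary_search(const int *arr, size_t lo, size_t hi, int target)
{
    if (lo >= hi)
    {
        return false;                 // nothing left to look at
    }
    size_t mid = lo + (hi - lo) / 2;  // look at the middle
    if (arr[mid] == target)
    {
        return true;
    }
    else if (target < arr[mid])
    {
        return binary_search(arr, lo, mid, target);      // search left half
    }
    else
    {
        return binary_search(arr, mid + 1, hi, target);  // search right half
    }
}
```

For an array of 6 elements the first call would be binary_search(arr, 0, 6, target).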

  • algorithm works only if the array is sorted
  • so if something is unsorted, what algorithm can we use to sort it?

bubble sort

repeat until no swaps

 for i from 0 to n-2 // up thru the second to last element (last element is n-1, n being the number of elements) b/c second to last element is last one you can compare with the one after it

      if i'th and i+1'th elements out of order

         swap them

  • computer goes thru looking at two elements at a time and fixes all pairwise mistakes – which doesn’t solve the sorting problem the first run thru, but reduces the number of problems by bubbling up the greatest value to the end thereby allowing the comp to ignore those values
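
A sketch of the pseudocode in C for an array of ints (my own translation; this version doesn't bother skipping the already-bubbled values at the end):

```c
#include <stdbool.h>  // bool, true, false
#include <stddef.h>   // size_t

// A sketch of bubble sort: sorts arr in place, smallest first.
void bubble_sort(int *arr, size_t n)
{
    bool swapped = true;
    while (swapped)                         // repeat until no swaps
    {
        swapped = false;
        for (size_t i = 0; i + 1 < n; i++)  // i from 0 to n-2
        {
            if (arr[i] > arr[i + 1])        // pair out of order: swap them
            {
                int tmp = arr[i];
                arr[i] = arr[i + 1];
                arr[i + 1] = tmp;
                swapped = true;
            }
        }
    }
}
```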

selection sort

for i from 0 to n-1 // aka the last element

    find smallest element between i'th and n-1'th

    swap smallest with i'th element

  • walking thru the list, finding the smallest remaining element and putting it at the end of the sorted part (which grows from the front of the list)
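
My own C sketch of the pseudocode, for an array of ints. The outer loop stops one short of the last element, since by then the last one is already in place.

```c
#include <stddef.h>  // size_t

// A sketch of selection sort: sorts arr in place, smallest first.
void selection_sort(int *arr, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
    {
        // find the smallest element between the i'th and (n-1)'th
        size_t smallest = i;
        for (size_t j = i + 1; j < n; j++)
        {
            if (arr[j] < arr[smallest])
            {
                smallest = j;
            }
        }
        // swap the smallest with the i'th element
        int tmp = arr[i];
        arr[i] = arr[smallest];
        arr[smallest] = tmp;
    }
}
```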

insertion sort

for i from 1 to n-1 // deciding that 0'th element, being in a list of one, is already sorted

    call 0'th thru i-1'th elements the "sorted side"

    remove i'th element

    insert it into sorted side in order

  • deal with the problem like there’s a left half and a right half, inserting elements in order
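
And a sketch of insertion sort in C (again my own translation, for ints). "Removing" the i'th element just means holding it in a variable while the bigger sorted elements shift right to make room.

```c
#include <stddef.h>  // size_t

// A sketch of insertion sort: sorts arr in place, smallest first.
void insertion_sort(int *arr, size_t n)
{
    for (size_t i = 1; i < n; i++)   // 0'th element counts as already sorted
    {
        int value = arr[i];          // "remove" the i'th element
        size_t j = i;
        while (j > 0 && arr[j - 1] > value)
        {
            arr[j] = arr[j - 1];     // shift a larger sorted element right
            j--;
        }
        arr[j] = value;              // insert into the sorted side, in order
    }
}
```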

judging efficiency

  • if n is the number of elements to sort, how many steps will it take?
    • for example, bubble sort takes n^2/2 – n/2
    • taking the biggest magnitude term, let’s focus on n^2/2
    • for example, if there are 1 million elements, it’s basically 500 billion
  • O = upper bound (number of steps in a “worst case” scenario of the most work created), what is looked at when determining “running time”
    • examples:
      • O(n^2) ((not great)) — bubble sort, selection sort, insertion sort
      • O(n log n)
      • O(n) — linear search
      • O(log n) — binary search
      • O(1) – a constant number of steps — checking an if statement, for example
  • Ω = lower bound (# of steps in best case/lucky scenario aka list is already sorted) ((“omega”))
    • examples:
      • Ω(n^2) — selection sort
      • Ω(n log n)
      • Ω(n) — bubble sort
      • Ω(log n) – this, and Ω(1) mean you would have sorted the elements in fewer than n steps, which means you wouldn’t have even looked at each element once, which is impossible (when talking about sorting problems)
      • Ω(1) — linear + binary search
  • if upper bound running time and lower bound running time are equal, you can claim program to have running time of Θ (“theta”)
  • with a greater number of elements, O(n^2) feels really slow… so is that the best we can do to sort elements?

merge sort

on input of n elements

   if n < 2

      return

   else

      sort left half of elements

      sort right half of elements

      merge sorted halves

  • demonstrative of a class of algorithms that can do better than O(n^2) using recursion
  • keep using merge sort to “sort left half” and “sort right half”
  • runtime of O(n log n) — any time you keep cutting a problem in halves it’s O(log n) and then on top of that, on each step of halving comp had to run thru all elements
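
A sketch of merge sort in C for ints. The helper names (merge, merge_sort_with) and the single scratch array are my own design choices; the lecture only gave the pseudocode above.

```c
#include <stdlib.h>  // malloc, free, size_t
#include <string.h>  // memcpy

// Merges the sorted halves arr[0..mid) and arr[mid..n), using tmp as scratch.
static void merge(int *arr, size_t mid, size_t n, int *tmp)
{
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
    {
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    }
    while (i < mid)
    {
        tmp[k++] = arr[i++];
    }
    while (j < n)
    {
        tmp[k++] = arr[j++];
    }
    memcpy(arr, tmp, n * sizeof(int));
}

static void merge_sort_with(int *arr, size_t n, int *tmp)
{
    if (n < 2)
    {
        return;                                // base case
    }
    size_t mid = n / 2;
    merge_sort_with(arr, mid, tmp);            // sort left half
    merge_sort_with(arr + mid, n - mid, tmp);  // sort right half
    merge(arr, mid, n, tmp);                   // merge sorted halves
}

// Sorts arr in place. Returns 0 on success, 1 if scratch memory
// could not be allocated.
int merge_sort(int *arr, size_t n)
{
    if (n < 2)
    {
        return 0;
    }
    int *tmp = malloc(n * sizeof(int));
    if (tmp == NULL)
    {
        return 1;
    }
    merge_sort_with(arr, n, tmp);
    free(tmp);
    return 0;
}
```

The extra scratch array is the price of the speed: merge sort trades some memory for fewer steps.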

Computational Complexity

  • a relatively math-heavy topic
  • how does this algorithm scale when we throw more and more data at it?
  • there’s time complexity and space (memory) complexity
  • generally when talking about an algorithm we’re talking about worst-case (as opposed to best-case Ω)
  • let’s call it f(n), where n is the number of elements in the data set and f(n) is how many units of resources it requires to process that data

Today’s Resources:

https://www.edx.org/course/introduction-computer-science-harvardx-cs50x

Well, summer break happened.

In the teaching world, summer break is a thing. And here in Japan, it’s basically all of August. Nobody really likes studying over vacation (except perhaps the Japanese). That being said, I could have continued my studies in computer programming. Sure, it would have been hard – there’s no wifi on the beach, after all. But I could have made some effort.

I did not.

So here I am, after a month-long break. I’ve relaxed, I’ve procrastinated a (maybe more than) tiny bit, and now I’m back. Ready to go. Ready to embrace my uber-noobness and struggle through what seem, on the surface, to be the simplest of problems (CS50 pset2, anyone?).

For, as it has been said by many wise people in many ways, the only way I can fail at this is if I give up.

Vocabs and Stuffs

Parsing

  • use when you have a big chunk of data (text, cryptography, phone numbers, etc.)
    • phone number – parse first 3 digits to get the area code
    • text – parse into words/sentences, etc.
  • to parse is to go thru, chop up data into smaller pieces to identify and compare – going thru piece by piece
  • breaking data into meaningful pieces
  • unit of meaning

Iterate

  • go thru a loop and do the exact same thing over and over again
  • an iteration – a frame in a loop
  • “iterate over” – pass thru each part one frame at a time
  • unit of math
  • can iterate thru words, but not until you’ve parsed some text into words

Delineate

  • to identify a break in the meaning – to separate, to outline
    • ex: spaces delineate words so I can parse text into words
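
All three words show up in even a tiny program. In this sketch of mine (count_words is a hypothetical function, and it assumes spaces are the only thing separating words), the loop iterates over the chars, the spaces delineate the words, and the result is the text parsed into a word count.

```c
// Hypothetical helper: counts the words in text,
// treating spaces as the only delimiter.
int count_words(const char *text)
{
    int words = 0;
    int in_word = 0;
    for (const char *p = text; *p != '\0'; p++)  // iterate over each char
    {
        if (*p == ' ')
        {
            in_word = 0;   // a space delineates the end of a word
        }
        else if (!in_word)
        {
            in_word = 1;   // first char of a new word
            words++;
        }
    }
    return words;
}
```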

Organization When Coding

  • start with an outline, big picture
  • then fill in functions with their definitions
  • start with brain’s way of understanding, then translate into code
    • English -> pseudo-code -> code

Recursion

  • when you take in some input, get output, and put that output back thru (however many times)

F(x) = 2x

F(F(x)) = 2(2x) = 4x

  • in this example, both xs are local – they are not the same but are distinct from each other
  • a recursion example: a function taking in itself (tricky b/c of scope and accidental infinite loop)

Today’s Resources:

My older brother.

CS50 — Week One Notes, part 2

I had to break these up into two different days because it was information overload.

Side note to self: If I’m ever in my workspace and I accidentally made a never-ending program that’s trying to force me into infinite loops of the same banal GetInt() function, use ctrl-c to stop it (ctrl-z only suspends the program)!!!

Conditional Statements

  • conditional branches/expressions/statements allow programs to make decisions depending on variable values or user input

if (boolean expression)

{

}

  • ex. of boolean expressions: (mouse down), (x < 10), etc.
  • boolean expression evaluates to true? all lines of code between curly braces will execute from top to bottom
  • boolean expression false? those lines of code will not execute

if (boolean expression)

{

}

else

{

}

  • boolean expression evaluates to true? all lines of code between 1st set of curly braces will be executed, top to bottom
  • false? follow instructions in 2nd set of curly braces
  • I can create chains of mutually exclusive branches by using if, else if (* n), else with as many different boolean expressions
  • I can layer on as many if statements as I like (creating a chain of non-mutually exclusive branches), but the else will only bind to the nearest if

switch()

  • allows me to specify distinct cases rather than having to rely on boolean expressions
  • must break between each case, or I’ll “fall thru” each case and execute them all in order (unless that’s what I’m trying to do)

int x = GetInt();
switch(x)
{
    case 5:
        printf("Five!\n");
    case 4:
        printf("Four!\n");
    case 3:
        printf("Three!\n");
    case 2:
        printf("Two!\n");
    case 1:
        printf("One!\n");
    default:
        printf("Blast off!\n");
}
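
The countdown above falls thru on purpose. For contrast, here's a sketch of my own with a break after every case, so exactly one branch runs (size_name is a made-up function):

```c
// Hypothetical helper: returns a word describing n.
// Each case ends with break, so there is no falling thru.
const char *size_name(int n)
{
    const char *name;
    switch (n)
    {
        case 1:
            name = "one";
            break;
        case 2:
            name = "two";
            break;
        default:
            name = "many";
            break;
    }
    return name;
}
```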

The ternary operator ?:

  • useful when writing very short if-else statements
these 2 code snippets act identically

// snippet 1
int b = GetInt();
int x;
if (b == 0)
{
    x = 5;
}
else
{
    x = 6;
}

// snippet 2
int b = GetInt();
int x = (b == 0) ? 5 : 6;

Loops

  • allow programs to execute lines of code repeatedly (saves me from needing to copy/paste or repeat lines of code)

while (true)

{

}

  • infinite loop: lines of code between curly braces will execute repeatedly top to bottom forever (because true is always true)

while (boolean expression)

{

}

  • if boolean expression evaluates to true, all lines of code within curly braces will execute repeatedly top to bottom, until the boolean expression evaluates false

do

{

}

while (boolean expression);

  • executes all lines of code in curly braces, then checks boolean expression – if true, goes back and repeats code block until boolean exp. evaluates to false

for (int i = 0; i < 10; i++)

{

}

  • repeats the body code a specified number of times (above, for example, 10 times)
  • what are the parts?!
    • the counter variable(s) (above, i) is set (above, to 0)
    • the boolean expression is checked (above, i < 10)
      • if true, body code executes
      • if false, body code does not execute
    • after getting to the end of the body code, the counter variable i is incremented (above, by 1, using the shorthand var++)
    • the boolean exp. is checked again and either goes thru the body code again or does not
  • more generally: for (start; expr; increment) where there can be more than one statement in start and increment
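
To compare the three kinds of loop side by side, here's a sketch where each one adds up 1 thru n (the function names are mine). The do-while version shows the one real difference: its body always runs at least once, even when n is 0.

```c
// Adds 1 + 2 + ... + n with a for loop.
int sum_for(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
    {
        total += i;
    }
    return total;
}

// Same sum with a while loop: the counter is set up and bumped by hand.
int sum_while(int n)
{
    int total = 0;
    int i = 1;
    while (i <= n)
    {
        total += i;
        i++;
    }
    return total;
}

// Same idea with do-while: the body runs once before the check,
// so for n < 1 this still adds 1.
int sum_do_while(int n)
{
    int total = 0;
    int i = 1;
    do
    {
        total += i;
        i++;
    }
    while (i <= n);
    return total;
}
```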

 

CS50 — Week One Notes

From Scratch to C.

  • same elements, different language to implement
  • write a bit of code, run/test it, then move on once it’s debugged (vs trying to write all your code at once and then not really knowing which part is making the whole thing crash, etc.)
  • any time I’m copying and pasting code from one place to another without changing a thing, there’s probably a better way to do it (a cleaner design)
  • source code –> pass as input to a compiler –> compiler turns it into machine code (0s and 1s) that the computer can read
    • “compiling” = umbrella term
    • preprocessing
      • find and (re)place
      • ex. “#include”s in C are in this step
    • compiling
      • source code –> assembly code
      • assembly code = arcanely worded sets of instructions that sets of 0s and 1s are mapped to
    • assembling
      • assembly code –> 0s and 1s
    • linking
      • computer gets the 0s and 1s from the code I wrote in my .c file, CS50 staff’s 0s and 1s from what they wrote in the CS50 library, printf()’s 0s and 1s from the standard input/output library, etc. (whatever was used to write my program) and combines together in my program “hello” (or w/e the name is)

The imprecision of number representation due to a computer’s limitations

  • we humans can think of and create an infinite number of numbers
  • computers are limited by how much memory’s available in their RAM
    • think about it – without any tricks or extra code, a byte can only store integers of value 0 thru 255!
  • we humans need to be aware of this limitation
    • ex: when the Patriot anti-air missile system was programmed to count time in 1/10s of a second, and so after running for many hours straight its timing was off and lives were lost
  • by creating a simple program in C, we can see that the computer’s representation of the decimal created by dividing 1 by 10 (should be 0.10) proves itself to be fundamentally imprecise when taken to too many decimal places
    • printf("n is %.55f", 1.0 / 10.0); yields “n is 0.1000000000000000055511151231257827021181583404541015625”
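
The same tiny error builds up when you keep adding, which is the Patriot problem in miniature. In this sketch of mine (sum_tenths and nearly_equal are hypothetical helpers), adding 0.1 ten times typically doesn't land exactly on 1.0, so the safer habit is to compare with a small tolerance rather than with ==.

```c
// Adds 0.1 to itself count times.
double sum_tenths(int count)
{
    double total = 0.0;
    for (int i = 0; i < count; i++)
    {
        total += 0.1;
    }
    return total;
}

// Hypothetical helper: returns 1 if a and b are within tolerance
// of each other, 0 otherwise.
int nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;
    if (diff < 0)
    {
        diff = -diff;
    }
    return diff <= tolerance;
}
```

So nearly_equal(sum_tenths(10), 1.0, 1e-9) is the kind of check to reach for, whereas sum_tenths(10) == 1.0 can't be relied on.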

The Linux Command Line

  • Mac OS is a descendant of the Unix operating system (“a Unix-based system”); Linux was developed independently but is similar to Unix; Windows was built from ground up in a completely different way
  • the default command line in the CS50 IDE is terminal called “workspace”
  • shell = software interface (often a command line interface) that allows users to interact with the computer
  • bash = “Bourne-Again Shell”, the default shell environment in Linux and Mac OS X

name of command <argument>

ls

  • short for “list”, gives readout of all files and folders in current directory
    • blue = directory (that I can navigate into)
    • black = text or source code file
    • green = executable file

cd <directory>

  • short for “change directory”, changes the current directory to the specified <directory> in my workspace or on my operating system
    • shorthand for current directory = .
    • shorthand for parent directory of current directory = ..
    • pwd (“print working directory”) gives the full path of the current directory
    • cd on its own brings me back to my home directory
  • if the name of a directory has multiple words separated by spaces, I must enclose the name in quotes (or escape each space with a backslash) when referring to it

mkdir <directory>

  • short for “make directory”, creates a new subdirectory called <directory> located in the current directory
  • analogous to: in GUI operating system, right click and in prompt click “New Folder”

cp <source> <destination>

  • short for “copy”, creates duplicate of file I specify as <source>, which it saves in <destination>
  • cp -r <source directory> <destination directory>
    • copies entire directories (“-r” flag stands for recursive, tells cp to copy everything inside directory, including any subdirectories it may contain)

rm <file>

  • short for “remove”, deletes <file>; in the CS50 IDE it first asks me to confirm (y/n), though on many other systems plain rm deletes without asking
  • rm -f <file>
    • skips the confirmation (but can’t undo so beware)
  • rm -r <directory>
    • deletes entire directory (can combine with -f flag: “rm -rf <directory>”)

mv <source> <destination>

  • short for “move”, moves a file from <source> to <destination>; since the destination can just be a new name in the same directory, this is also how I rename a file

chmod <options> <permissions> <file>

  • short for “change mode”, changes the permissions of files or directories (aka who can access the file and how they can access it)
  • options are like the -f flag we saw before, plus others
  • permissions can be defined for:
    • user (owner of file)
    • group (users who belong to the group assigned to the file)
    • others (anyone else)
  • permissions can include:
    • read (4), write (2), execute (1), no permission (0)
  • example in symbolic permissions notation:

    chmod u=rwx,g=rx,o=r myfile

  • same example in octal permissions notation:

    chmod 754 myfile

ln <target> <linkname>

  • short for “link”, creates links between files
  • if <linkname> is omitted, link to <target> is created in current directory using the name of <target>
  • different from cp: the same data is being pointed to by both file names
    • if I go into the data and change it through either name, the change shows up under both
    • if one of the names is deleted, the data is still safe under the other name that points to it
  • use the flag -s to create symbolic links (file points to other file/directory where data/files are stored)

touch <file>

  • changes file timestamps (access, modification times) to current system time; if <file> doesn’t exist yet, creates it as an empty file

rmdir <directory>

  • short for “remove directory”, removes directories that are empty

man <argument>

  • short for “manual”, shows the system’s reference manual for that argument (usually a program, utility or function)

diff <file> <file>

  • short for “difference”, compares two files line by line and outputs instructions to change the 1st file to match the 2nd

sudo <command>

  • short for “superuser do”, allows a permitted user to execute a command as another user (by default the superuser, aka root)

clear

  • clears the screen
  • same as ctrl + l in bash

telnet

  • short for “TELNET protocol”, used for communication with a remote host or device
    • ex: “telnet hostname” would connect user to a host named hostname
    • unsafe to transfer data in this way because it’s sent unencrypted, so others could read it as it’s being transmitted

Data Types

  • in C, an older language, we have to specify the data type of each variable before we can work with it

Native to C:

int

  • used for variables that will store integers
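  • in the CS50 IDE (and on most modern systems) integers take up 4 bytes (32 bits) of memory, though the C standard doesn’t guarantee that size
    • with 32 bits, that means we can store roughly -2 billion to 2 billion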
  • unsigned int
    • unsigned = qualifier that doubles positive range at the expense of disallowing negative values
      • other qualifiers = long, short, const
    • if I know my value will never be negative, I can use this to count up to ~4 billion

char

  • used for variables that will store single characters
  • characters always take up 1 byte (8 bits on virtually every modern system) of memory
    • thanks to the ASCII system, different numbers have been assigned to characters (ex: '0' = 48, 'A' = 65)

float

  • used for variables that will store floating-point values (real numbers)
  • floating-point values typically take up 4 bytes (32 bits) of memory
    • those bits are split between a sign, an exponent and the significant digits (not simply an integer part and a decimal part)
    • so we are limited in how precise we can be

double

  • used for variables that store floating-point values
  • double precision -> typically take up 8 bytes (64 bits) of memory, so they can hold more significant digits than a float

void

  • a type, but not a data type (can’t assign a value to it)
  • void functions don’t return a value
  • a function’s parameter list can also be void if the function takes no parameters
    • like “main”, when we haven’t been passing any arguments into main (we could tho)

Thanks to the CS50 Library:

bool

  • used for variables that store a Boolean value (true or false)
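  • booleans are a standard default (native) data type in many modern programming languages; in C, bool, true and false only become available through the <stdbool.h> header (added in C99), which the CS50 Library includes for me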

string

  • used for variables that will store a series of characters (a “string”)
  • can be words, sentences, paragraphs or even books!
  • not a native data type in C, so doesn’t come up as purple in the IDE

And later on…

structs

  • used to group several variables, possibly of different types (like ints and strings), into one unit

typedefs

  • “defined types” that allow me to give a new name to an existing type, which is how I create my own data types (usually together with a struct)

Variables

  • to create (“declare”) a variable, specify the data type, give it a name, and slap a semicolon on the end
  • to create multiple variables of same type, specify the type once and then list as many variables as desired
    • int height, width;
      • creates two integer type variables with respective names “height” and “width”
  • good practice = create variable right when I need it
  • after a variable’s been declared, no need to specify variable type again
  • simultaneously declaring and setting the value of a variable = “initializing”
    • int price = 17; // we’ve specified an integer, named the variable “price” and assigned it a value of 17

Operators

  • in C, we can add, subtract, multiply and divide numbers, as well as get the remainder of left number divided by right (modulus)
    • int m = 13 % 4; // m is now 1 (13 / 4 = 3 with a remainder of 1)
    • the modulus operator is more useful than you’d think (like in a random number generator, whereby you want to limit the range of the random number you generate)
  • shorthands
    • x = x * 5; -> x *= 5; // works with all 5 basic operators
    • x++; // incrementing variable by 1
    • x--; // decrementing variable by 1
      • x = x + 1; x += 1; and x++; all do the same thing

boolean expressions

  • in C, every nonzero value = true and every zero = false
  • don’t always have to use type bool when we are working with boolean expressions
  • logical operators
    • logical AND (&&) is true if and only if both operands are true, otherwise false
      • if (x && y) { ...go down this path... }
      • x is true and y is true? true
      • x is true but y is false? false …etc.
    • logical OR (||) is true if and only if at least one operand is true, otherwise false
      • if (x || y) { ... }
    • logical NOT (!) inverts the value of its operand (called “bang”, “exclamation” and “not”)
      • x is true? !x is false (and vice versa)
  • relational operators
    • less than (x < y)
    • less than or equal to (x <= y)
    • greater than (x > y)
    • greater than or equal to (x >= y)
    • equality (x == y)
    • inequality (x != y)