CS50 — Week 4

Strings Are a Lie!

  • what we’ve been treating as type “string” in the CS50 IDE is really just the memory address of the first char in any given string
    • so how does the computer find the “string” I want if I, say, want to printf it?
    • the computer goes to the memory address of the first char and reads the chars at each subsequent address until it reaches the char \0, which marks the end
  • essentially a string is really just a pointer
  • string = synonym for data type char *
    • there is no string!!

The Star Operator

  • in an expression, * is the “go there” operator (also called the “dereference” operator): *x means “go to the address stored in x”
  • in a declaration, * after a type means the address of that type, aka a pointer to that data
  • i.e. the way we type out “the address of a char” = char *

Pointers

  • pointers are just addresses where some data is stored!
  • when using pointers, be careful:
    • give the pointer a pointee (they are two separate things, so a pointee is not set up automatically): x = malloc(sizeof(int));
    • dereference a pointer to access its pointee: *x = 42;
    • assignment (=) between pointers gives them access to the same pointee (called “sharing” a pointee): y = x;
  • my own question: in these examples, then, are x and y the pointers and *x the pointee?

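Pointer Arithmetic

  • math with pointers! adding 1 to a pointer moves it forward by one element of the type it points to, so *(p + 1) is the same as p[1]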

Memory Leak

  • if I start dynamically allocating memory in my program, I need to worry about freeing up that allocated memory once I’m done using it (or else my program may grow much more costly in terms of memory than it needs be)
  • asking for memory and then never giving it back – memory leaks – can cause a Mac or PC to slow
  • we can use valgrind ./program_name to show us memory leaks, even in programs that otherwise seem fine (aka they can be made and run)

The Heap and The Stack

  • the “stack” refers to functions that are called
    • on the bottom is main()
    • stacked on top are any functions called within main(), functions called within those functions, etc. etc.
    • once a function is finished (aka has gone thru all its steps/returned some value), the computer can “forget” about it and it’s taken off the stack, so to speak
  • the “heap” refers to memory allocations that are made
    • in the usual diagram the heap sits at the top: the first malloc() is at the top, and each later malloc() goes below it
  • stack overflow
    • when the stack grows up too far
    • a comp can be hacked and damaged if a hacker can take control using a memory mistake I’ve made!
  • heap overflow
    • when the heap grows down too far
  • both are examples of what’s called buffer overflow, where you are trying to use more memory than you should
  • to protect against hacking attacks & server compromise (at least in one way), the programmer needs to make sure that when using a chunk of memory, she checks the bounds so that no one can go in and overwrite parts of memory with their own data…
    • ex: take a command line argument to be put into some char * of size 10 but then don’t check (maybe with an if() statement, for example) that the user hasn’t input more chars than fit
    • the user could then input enough that the return address (a little breadcrumb the comp leaves itself on the stack as a reminder of where to return to after the current function is finished) is overwritten with the user’s own data, so the comp never returns to the right place in the program and may instead be sent to the user’s own malicious code

Images!

Bitmap – BMP

  • literally a map of bits with corresponding color values
[Image: a very basic smiley face bitmap, where 1s are white and 0s are black]

JPEG

  • what Facebook, cameras use
  • stores potentially millions of colors
  • allows for compression – throw away some of the 0s and 1s in order to decrease the file size for easier storage, emailing, etc.
  • all JPEG files start with the same 3 bytes: 255 216 255

Hexadecimals

  • when speaking of images & colors, decimal and binary give way to hexadecimals
  • 16 digits, from 0 – 15, borrowing alphabet letters after 9
    • 0 1 2 3 4 5 6 7 8 9 a b c d e f
  • 0x is simply a convention to signal that hexadecimal digits follow
    • 0xff = 255 in decimal, 11111111 in binary
  • so the JPEG starting bytes in hexadecimal are: 0xff 0xd8 0xff

What is a File?

  • a bunch of bits, stored somewhere (on a laptop or desktop or phone, etc.)
  • what does it mean to be a Microsoft Word file vs. an Excel file vs. a JPEG vs. mp3…
    • first bits follow some pattern, demarcating that file type

struct

  • in C, allows you to create some structure inside of which you store information

.csv

  • “comma separated values” files
  • simple; Microsoft Excel, Google Spreadsheets, Apple Numbers can export these
    • just series of values separated by commas

 


Today’s Resources:

https://video.cs50.net/2016/fall/lectures/4?

Vocabs and Stuffs

Parsing

  • use when you have a big chunk of data (text, cryptography, phone numbers, etc.)
    • phone number – parse first 3 digits to get the area code
    • text – parse into words/sentences, etc.
  • to parse is to go thru, chop up data into smaller pieces to identify and compare – going thru piece by piece
  • breaking data into meaningful pieces
  • unit of meaning

Iterate

  • go thru a loop and do the exact same thing over and over again
  • an iteration – a frame in a loop
  • “iterate over” – pass thru each part one frame at a time
  • unit of math
  • can iterate thru words, but not until you’ve parsed some text into words

Delineate

  • to identify a break in the meaning – to separate, to outline
    • ex: spaces delineate words so I can parse text into words

Organization When Coding

  • start with an outline, big picture
  • then fill in functions with their definitions
  • start with brain’s way of understanding, then translate into code
    • English -> pseudocode -> code

Recursion

  • when you take in some input, get output, and put that output back thru the same function (however many times)

F(x) = 2x

F(F(x)) = 2(2x) = 4x

  • in this example, both xs are local – they are not the same but are distinct from each other
  • a recursion example: a function calling itself (tricky b/c of scope and accidental infinite loops)

Today’s Resources:

My older brother.

CS50 — Week One Notes

From Scratch to C.

  • same elements, different language to implement
  • write a bit of code, run/test it, then move on once it’s debugged (vs trying to write all your code at once and then not really knowing which part is making the whole thing crash, etc.)
  • any time I’m copying and pasting code from one place to another without changing a thing, there’s probably a better way to do it (a cleaner design)
  • source code –> pass as input to a compiler –> compiler turns it into machine code (0s and 1s) that the computer can read
    • “compiling” = umbrella term
    • preprocessing
      • find and (re)place
      • ex. “#include”s in C are in this step
    • compiling
      • source code –> assembly code
      • assembly code = arcanely worded sets of instructions that sets of 0s and 1s are mapped to
    • assembling
      • assembly code –> 0s and 1s
    • linking
      • computer gets the 0s and 1s from the code I wrote in my .c file, CS50 staff’s 0s and 1s from what they wrote in the CS50 library, printf()’s 0s and 1s from the standard input/output library, etc. (whatever was used to write my program) and combines together in my program “hello” (or w/e the name is)

The imprecision of number representation due to a computer’s limitations

  • we humans can think of and create an infinite number of numbers
  • computers are limited by how much memory’s available in their RAM
    • think about it – without any tricks or extra code, a byte can only store integers of value 0 thru 255!
  • we humans need to be aware of this limitation
    • ex: the Patriot anti-air missile system counted time in tenths of a second, a value that can’t be stored exactly in binary, so after running for many hours straight its timing was off and lives were lost
  • by creating a simple program in C, we can see that the computer’s representation of the decimal created by dividing 1 by 10 (should be 0.10) proves itself to be fundamentally imprecise when taken to too many decimal places
    • printf("n is %.55f", 1.0 / 10.0); yields “n is 0.1000000000000000055511151231257827021181583404541015625”

The Linux Command Line

  • Mac OS is a descendant of the Unix operating system (“a Unix-based system”); Linux was developed independently but is similar to Unix; Windows was built from ground up in a completely different way
  • the default command line in the CS50 IDE is a terminal called “workspace”
  • shell = software interface (often a command line interface) that allows users to interact with the computer
  • bash = “Bourne-Again Shell”, the default shell environment in Linux and Mac OS X

name of command <argument>

ls

  • short for “list”, gives readout of all files and folders in current directory
    • blue = directory (that I can navigate into)
    • black = text or source code file
    • green = executable file

cd <directory>

  • short for “change directory”, changes current directory to (a specified)<directory> in my workspace or on my operating system
    • shorthand for current directory = .
    • shorthand for parent directory of current directory = ..
    • pwd (print working directory) gives name of current directory
    • cd on its own will bring me back to where I started (normally my home directory)
  • if the name of a directory has multiple words separated by spaces, must enclose name in quotes when calling it up

mkdir <directory>

  • short for “make directory”, creates a new subdirectory called <directory> located in the current directory
  • analogous to: in GUI operating system, right click and in prompt click “New Folder”

cp <source> <destination>

  • short for “copy”, creates duplicate of file I specify as <source>, which it saves in <destination>
  • cp -r <source directory> <destination directory>
    • copies entire directories (“-r” flag stands for recursive, tells cp to copy everything inside directory, including any subdirectories it may contain)

rm <file>

  • short for “remove”, deletes <file>; here it asks me to confirm (y/n) first, though on many systems plain rm deletes without asking unless the -i flag is used
  • rm -f <file>
    • skips the confirmation (but can’t undo so beware)
  • rm -r <directory>
    • deletes entire directory (can combine with -f flag: “rm -rf <directory>”)

mv <source> <destination>

  • short for “move”, renames file, moving it from <source> to <destination>

chmod <options> <permissions> <file>

  • short for “change mode”, changes the permissions of files or directories (aka who can access the file and how they can access it)
  • options are like the -f flag we saw before, plus others
  • permissions can be defined for:
    • user (owner of file)
    • group (members who own the file with you)
    • others (anyone else)
  • permissions can include:
    • read (4), write (2), execute (1), no permission (0)
  • example in symbolic permissions notation:

    chmod u=rwx,g=rx,o=r myfile

  • same example in octal permissions notation:

    chmod 754 myfile

ln <target> <linkname>

  • short for “link”, creates links between files
  • if <linkname> is omitted, link to <target> is created in current directory using the name of <target>
  • different from cp – same data is being pointed to by both file names
    • if go into data and change it from either, the data is changed for both
    • if one of the files is deleted, data is still safe in other file which points to it
  • use the flag -s to create symbolic links (file points to other file/directory where data/files are stored)

touch <file>

  • changes file timestamps (access, modification times) to current system time; if <file> doesn’t exist yet, creates it as an empty file

rmdir <directory>

  • short for “remove directory”, removes directories that are empty

man <argument>

  • short for “manual”, shows the system’s reference manual for that argument (usually a program, utility or function)

diff <file> <file>

  • short for “difference”, analyzes two files and outputs instructions to change the 1st file to match the second

sudo <command>

  • short for “superuser do”, allows a permitted user to execute command as another user

clear

  • clears the screen
  • same as ctrl + l in bash

telnet

  • short for “TELNET protocol”, used for communication with a remote host or device
    • ex: “telnet hostname” would connect user to a host named hostname
    • unsafe to transfer data in this way because others could grab data as it’s being transmitted

Data Types

  • in C, an older language, we have to specify each data type before we can work with it

Native to C:

int

  • used for variables that will store integers
  • in the CS50 IDE (and on most modern systems), integers take up 4 bytes (32 bits) of memory
    • roughly, that means we can store -2 billion to 2 billion
  • unsigned int
    • unsigned = qualifier that doubles positive range at the expense of disallowing negative values
      • other qualifiers = long, short, const
    • if I know my value will never be negative, I can use this to count up to ~4 billion

char

  • used for variables that will store single characters
  • characters always take up 1 byte (8 bits) of memory
    • thanks to the ASCII system, different numbers have been assigned to characters (ex: “0” = 48)

float

  • used for variables that will store floating-point values (real numbers)
  • floating point values always take up 4 bytes (32 bits) of memory
    • the bits are split among a sign, an exponent and the significant digits (not simply an integer part and a decimal part)
    • so we are limited in how precise we can be

double

  • used for variables that store floating-point values
  • double precision -> always take up 8 bytes (64 bits) of memory

void

  • a type, but not a data type (can’t assign a value to it)
  • void functions don’t return a value
  • a function’s parameter list can also be void if the function takes no parameters
    • like “main”, when we haven’t been passing any arguments into main (we could tho)

Thanks to the CS50 Library:

bool

  • used for variables that store a Boolean value (true or false)
  • booleans are a standard default (native) data type in many modern programming languages, but not in C

string

  • used for variables that will store a series of characters (a “string”)
  • can be words, sentences, paragraphs or even books!
  • not a native data type in C, so doesn’t come up as purple in the IDE

And later on…

structs

  • used to group values of different types (like ints and strings) into one unit

typedefs

  • “defined types” that allow me to create my own data types

Variables

  • to create (“declare”), specify the data type, give it a name, and slap a semicolon to the end
  • to create multiple variables of same type, specify the type once and then list as many variables as desired
    • int height, width;
      • creates two integer type variables with respective names “height” and “width”
  • good practice = create variable right when I need it
  • after a variable’s been declared, no need to specify variable type again
  • simultaneously declaring and setting the value of a variable? (“initializing”)
    • int price = 17; // we’ve specified an integer, named the variable “price” and assigned it a value of 17

Operators

  • in C, we can add, subtract, multiply and divide numbers, as well as get the remainder of left number divided by right (modulus)
    • int m = 13 % 4; // m is now 1 (13 / 4 = 3.25 or 3 1/4)
    • the modulus operator is more useful than you’d think (like in a random number generator, whereby you want to limit the range of the random number you generate)
  • shorthands
    • x = x * 5 –> x *= 5; // works with all 5 basic operators
    • x++; // incrementing variable by 1
    • x--; // decrementing variable by 1
      • x = x + 1, x += 1 and x++ all do the same thing

boolean expressions

  • in C, every nonzero value = true and every zero = false
  • don’t always have to use type bool when we are working with boolean expressions
  • logical operators
    • logical AND (&&) is true if and only if both operands are true, otherwise false
      • if (x && y) {..go down this path..}
      • x is true and y is true? true
      • x is true but y is false? false …etc.
    • logical OR (||) is true if and only if at least one operand is true, otherwise false
      • if (x || y) {…}
    • logical NOT (!) inverts the value of its operand (called “bang”, “exclamation” and “not”)
      • x is true? !x is false (and vice versa)
  • relational operators
    • less than (x < y)
    • less than or equal to (x <= y)
    • greater than (x > y)
    • greater than or equal to (x >= y)
    • equality (x == y)
    • inequality (x != y)

 

basics of higher-level software programming languages

Notes:

Computers are linear – they are reading top to bottom, left to right. If you want to make two things happen at once, you are creating the illusion that the computer is doing this. (But it’s not, it’s doing one thing at a time still.)

Everything’s possible with programming, but it takes some cleverness to accomplish it.

Everything has a position – the file, the program, etc.

Frames: everything you want to have happen “at the same time” has to happen within the same frame – calculate a little, show a little, calculate a little, show a little.

Anything that happens first tells the computer what’s coming next. (For example, in CSS the “.” before “.thick-outline” tells me that it’s going to be read by the computer as a class, and not something else.)

Always think about everything possible that people could do. If you have some data they can input, think about all the different data they could give you, even outside your expectations (hackers use data exceptions to break programs, for example).

If you ask the computer, what data is in that position, and that position does not exist, it’ll be “null”.

Array = database without a pointer, instead it just has numerical names to identify the location of various elements. (No “next line”, just “data piece 4” or “data piece 7”.)

Native data types = what’s already there. If it’s not a native data type (like strings in C), it’s going to take an extra cleverness to make it work.

Because the computer reads left to right, there’s precedence. Whatever comes first sets the stage. For example, if quotes tell the computer there’s text coming, and then I type “//”, this will be shown as text and not create a comment.

In C, “;” is not needed at the end of pre-processor directives (#blah input library blah), comments, function definitions, and control structures (if, for, while, etc.).

In C, you must start with the function “main” (like how you have to put everything you want the user to see inside the <body> tag in HTML).


4 basics:

  1. comments

  2. data

  3. functions

  4. variables/names

comments

  • text not read by the compiler or the interpreter
  • // – line comment (end comment by going to new line)
  • /* – block comment (end with */)
  • for me, for other people looking at my code, and for remembering things that for the moment aren’t visible (I create some variable but what am I using it for? etc.)

data

  • number types
    • boolean
      • 0 or 1 – one binary digit
      • for example: “true” returns a boolean 1
      • use it every time you can
    • integers (C: “int”)
      • shorts, longs
      • if you can stick with integers, do! otherwise it’s a big mess
    • floating points/floats
      • decimals are infinite! (and therefore can get very messy)
  • character types (C: “char”)
    • strings = a string of characters/a blip of text
      • in C, there’s support for how you can make your own strings, but it’s not its own type like it is in C++
        • one way: make an array of letters
        • a bunch of things stored in your memory, accessed thru pointers telling your computer where in the memory to look for that piece of data
        • can use functions like “printf”
  • pointer types
    • every type of data is really just a position on the memory that your computer is accessing
    • the computer does not hold anything in its hands – it goes line by line and accesses the memory
    • even RAM is used in the same way, where the computer has to call it up from memory to access it, but it’s just a smaller amount and therefore easier/faster for the computer to get through it
    • pointers tell the computer where to look in the memory

functions

  • basic, native functions
    • aka the actual building blocks
      • examples: +, int, return, true, const char, etc. (everything pink in Xcode)
      • “int” locates, sets up memory space that’s the right size to store integers
  • library/included functions
    • a group of other people’s functions (built off native functions)
    • problem could be that I don’t know how it works, since someone else made it
      • documentation (description, examples, source code) or testing can help
  • custom functions
    • my own (also built off native functions)
  • functions work with data
    • functions that work with integers only, for example
  • 3 things functions do:
    • input data into the function from the code (either actual user input I type in and can see or from variables I’ve set up with some stored memory of some data) = parameters
      • a function can have no parameters, many (a lot of things to input), etc.
    • output data into the code = return
      • for example, if I am using the function “+” and I input the data (3, 3) the output will be (6)
        • if I set a variable y = 3+3, then y will have a value of 6 anywhere it appears in my code
      • a function can only return one value -> need many return values? use multiple functions
      • the “return” function helps other functions return
    • do
      • changing a pixel in a game, putting a command prompt on the screen, making a sound
      • many functions could come first and pass data around before you ever come to a “do” that the user can see
      • can be a do function just on its own, for example “delay” that makes the program wait
      • code pointer movement functions, just moving code pointers around, like “if” “then” “loop” etc. are do
      • you can’t program a functionality – every function with a “do” is native
    • some functions will do all 3, many will have parameters and then do something or give a return
    • syntax of a function: name (parameters)
    • most of the time, my functions will call other functions – functions inside functions
      • subdividing functions into smaller little pieces is useful because then I can reuse them other places – allows for flexibility & allows me to work on smaller tasks at a time for more gratification/feeling of progress
    • when you use a function, it’s a “call”
    • before you ever use it, you need to define a function (tell the computer what to do)

variables

  • there’ll be many custom things I’ll make – functions, variables, arrays & structs of different types
    • variables = holding data directly
    • arrays = bunch of different data each with a number to identify it
    • structs = multiple pieces of data held in a single location (using different sub-names to access them within that location)
  • start by declaring (its type), then space and the name of the variable
    • ex: “int variable_name”

Today’s resources:

A 4-hour FaceTime with my older brother.

Aside

viking school’s “How a Website is Built” notes

An example workflow to make a simple blog for a group of 3 friends.

The workflow =

discuss user needs with the client ->

build a mockup & check with client for satisfaction ->

break the mockup into bite-sized stories

story by story, do the following:

->-> determine, set up data architecture

 ->-> build back end

->-> build front end

-> refactor (restructure existing code w/out changing its external behavior), iterate (repeat) & ship (will learn more about this later) site to production

  • he uses Balsamiq to create mockups
  • when creating a story, think about who, what and why
  • in our example, we have 3 stories:
    • As an [author], I want to [publish posts on the site] so that I can [put my thoughts on the web]
    • As a [reader], I want to [easily view the most recent posts] so that I can [stay up to date]
    • As a [reader], I want to [write a comment on that post] so that I can [offer my thoughts on a post]
  • next step is to prioritize which stories to tackle first
  • he uses PivotalTracker to track stories/his progress
  • once selected the story to work on, determine data needs
    • on a whiteboard, pencil/paper etc. – diagram out what you’re going to have to work thru to start planning how you’ll implement it
  • work thru backend->frontend
  • check site/debug throughout
  • refactor (he says his code isn’t good and should be cleaned up – I’m curious as to what this actually means)
  • present to client
more more more

Getting right back to it:

Phishing is when someone masquerades as someone else (often with a fake website) to trick people into sharing their personal information.

Malware = malicious software (usually installed without your knowledge). Malware can be in the form of software masquerading as good (like an anti-virus program that’s really a virus) or it can install itself without asking because of some site you visited. Once on your computer, malware can access your data and files and do whatever with them. Eek!

URL = uniform resource locator, consisting of the protocol (like “http://”), the domain (like “google.com”) and the path (like “/internet.html”)

◊ The web = the part of the Internet that uses HTTP (hypertext transfer protocol). Which leads me to ask – what are the other parts of the Internet and what are they used for?

◊ A website = a collection of webpages (documents written in HTML) on the same domain.

◊ A browser = a program that can open webpages and display them (by interpreting the source code).

◊ Basically, everything that an everyday user interacts with is a program. I didn’t know that!

◊ Again, take this as you will (it’s coming from a complete beginner), but I did not previously know that you can “inspect”, aka see the HTML code, of basically any website you’re on. That’s cray-cray!

◊ And one point from actual coding I did (ha): In CSS, classes are marked with a “.” and ids are marked with a “#”. Both of them are more specific than the generic element styling (body, p, h1, h2, li, etc.). However, the rule is only 1 id per page – unlike classes, which you can add to as many elements as you like.

Today’s resources:

https://bento.io/topic/web

https://www.youtube.com/channel/UCVTlvUkGslCV_h-nSAId8Sw (LearnCode.academy’s YouTube channel)

While I’m by no means making crazy leaps and bounds here in my understanding of HTML and CSS, I am starting to wonder why I don’t just try building my own blog layout. Like, from scratch. Instead of this template.