Think of objects as namespace dictionaries: keys are strings, values are functions or integers or other objects, anything that Python can deal with.
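As a rough illustration of the dictionary view, the sketch below peeks at an instance's attributes with the built-in vars(), which returns the object's __dict__. The class name Point and its attributes are made up for this example.

```python
class Point:
    pass

p = Point()
p.x = 1                         # ordinary attribute assignment...
attrs_after_x = dict(vars(p))   # ...shows up under the string key 'x'

vars(p)['y'] = 2                # writing to the dictionary directly...
y_value = p.y                   # ...creates an attribute we can read back
```

Here attrs_after_x is {'x': 1} and y_value is 2: the attribute syntax and the dictionary are two views of the same namespace.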
Objects look like this:
    class foo:
        # Constructor
        def __init__(self, value=3):
            self.value = value
            self.count = 3

        def update(self, newvalue):
            self.value = newvalue
            self.count += 1

        def num_updates(self):
            return self.count

        def get_value(self):
            return self.value

    foo1 = foo()
    foo2 = foo(5)
    foo1.update(7)
    foo1.update(11)
    print "foo1: %d %d" % (foo1.get_value(), foo1.num_updates())
    print "foo2: %d %d" % (foo2.get_value(), foo2.num_updates())
Notice the default argument for __init__. What will this code snippet print?
Everything is public, but there are strong conventions for using methods whenever possible. For instance, in the previous example we could write:

    foo1.value = 13

but this would be bad: it wouldn't have the same side effects that foo1.update would (the update count would not be incremented). You can signal to other programmers that peeking at or poking something is a really bad idea (because it is a private implementation detail, for example) by prefixing its name with an underscore:
    class Payment:
        def __init__(self, credit_card_number):
            self._ccn = credit_card_number
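The underscore is only a convention; Python does not enforce it. A minimal sketch, in which the last_four method and the sample card number are invented for illustration:

```python
class Payment:
    def __init__(self, credit_card_number):
        self._ccn = credit_card_number

    def last_four(self):
        # The intended, public way to see part of the number
        return self._ccn[-4:]

p = Payment("4111111111111111")
public_view = p.last_four()   # goes through the method
private_peek = p._ccn         # still works: nothing stops the access
```

The second access succeeds, which is why the leading underscore should be read as a request to other programmers rather than a lock.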
You can turn the functional syntax inside out to see how Python actually behaves behind the scenes:
    item = foo()
    foo.update(   # Call the class's method directly
        item,     # on a class instance -- this becomes "self"
        5)        # and an integer -- this becomes "newvalue"

which is exactly equivalent to:

    item = foo()
    item.update(  # Call the attribute of "item" called "update" (which is a
                  # reference to foo.update) and automatically pass "item" as the
                  # first argument...
        5)        # on a value, which becomes the second argument.
Methods defined in a class need "self" as their first parameter. It isn't mandatory to call it "self", but it's strongly advised, so other people will understand your code.
What happens if you forget "self"?
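One way to find out is to try it. The sketch below forgets "self" in two different places; the class Forgetful and its method names are hypothetical, and the error type names are collected rather than printed.

```python
class Forgetful:
    def __init__(self):
        self.count = 0

    def no_self_param():        # "self" missing from the parameter list
        return 0

    def no_self_prefix(self):   # "self." missing in the method body
        return count

f = Forgetful()
errors = []

try:
    f.no_self_param()           # Python still passes f as the first argument
except TypeError as e:
    errors.append(type(e).__name__)

try:
    f.no_self_prefix()          # "count" is looked up as a plain variable
except NameError as e:
    errors.append(type(e).__name__)
```

Leaving "self" out of the parameter list fails at call time with a TypeError, because the instance is passed anyway; leaving off the "self." prefix fails with a NameError, because attributes are never looked up implicitly.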
Suppose we were librarians and wanted to define a class hierarchy to keep our books in. We might start with something like this:
    class Book:
        def __init__(self, title, author):
            self.title = title
            self.author = author

        def info(self):
            return "%s by %s" % (self.title, self.author)

Now we realize we want a special kind of book to distinguish illustrated children's books from regular books. We can derive a subclass, like this:

    class ChildrensBook(Book):
        def __init__(self, title, author, illustrator):
            Book.__init__(self, title, author)
            self.illustrator = illustrator

        def info(self):
            bookinfo = Book.info(self)
            return "%s, illustrated by %s" % (bookinfo, self.illustrator)

What happens when we do this? Think about the call stack.

    book = ChildrensBook("Foo", "Author", "Illustrator")
    print book.info()
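One way to think about the call stack is to record each method as it is entered. The sketch below repeats the two classes with a hypothetical "calls" list added purely for tracing; it is not part of the original design.

```python
calls = []  # order in which methods are entered (tracing aid only)

class Book:
    def __init__(self, title, author):
        calls.append("Book.__init__")
        self.title = title
        self.author = author

    def info(self):
        calls.append("Book.info")
        return "%s by %s" % (self.title, self.author)

class ChildrensBook(Book):
    def __init__(self, title, author, illustrator):
        calls.append("ChildrensBook.__init__")
        Book.__init__(self, title, author)      # explicit call to the parent
        self.illustrator = illustrator

    def info(self):
        calls.append("ChildrensBook.info")
        bookinfo = Book.info(self)              # reuse the parent's formatting
        return "%s, illustrated by %s" % (bookinfo, self.illustrator)

book = ChildrensBook("Foo", "Author", "Illustrator")
text = book.info()
```

Afterwards "calls" holds ChildrensBook.__init__, Book.__init__, ChildrensBook.info, Book.info, in that order: each subclass method runs first and then calls into the parent explicitly.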
An interesting trick you can do is store a method reference and call it later:
    bi = book.info
    print bi()
    # 'Foo by Author, illustrated by Illustrator'
    book = None
    print bi()
    # 'Foo by Author, illustrated by Illustrator'

Here we've thrown away the reference to the ChildrensBook object itself, but since we still have this reference to one of its methods, the method call hangs onto the object for us. (This probably isn't something you need to know, but thinking about how you would implement this kind of feature could give you some insight into how programming languages are implemented.)
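To see where the object is kept, one can look inside the stored method. A small sketch, using a cut-down Book class invented for this example: a bound method carries its instance in __self__ and the underlying function in __func__.

```python
class Book:
    def __init__(self, title):
        self.title = title

    def info(self):
        return "Title: %s" % self.title

book = Book("Foo")
bi = book.info                   # a bound method: function plus instance

holder = bi.__self__             # the instance the method is bound to
same_object = holder is book

book = None                      # drop our own reference
result = bi()                    # still works: bi keeps the instance alive
explicit = bi.__func__(holder)   # what the call amounts to
```

Both result and explicit are "Title: Foo", which suggests one plausible implementation: a bound method is little more than a pair of (function, object).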
Operator overloading uses special names in Python: __add__, __mul__, and so forth. Here's a fairly complete Fraction class which uses this:
    import decimal
    from fractions import gcd   # Python 2.6+; supplies the gcd() used by reduced()

    class Fraction:
        def __init__(self, num, denom):
            self.n, self.d = int(num), int(denom)

        def __add__(self, other):
            if isinstance(other, Fraction):
                return Fraction(self.n * other.d + other.n * self.d,
                                self.d * other.d).reduced()
            if isinstance(other, int):
                return Fraction(self.n + self.d * other, self.d).reduced()
            raise TypeError("Cannot add to that type")

        # Addition is commutative, so int + Fraction can reuse __add__
        __radd__ = __add__

        def __sub__(self, other):
            if isinstance(other, Fraction):
                return Fraction(self.n * other.d - other.n * self.d,
                                self.d * other.d).reduced()
            if isinstance(other, int):
                return Fraction(self.n - self.d * other, self.d).reduced()
            raise TypeError("Cannot subtract from that type")

        def __mul__(self, other):
            if isinstance(other, Fraction):
                return Fraction(self.n * other.n, self.d * other.d).reduced()
            if isinstance(other, int):
                return Fraction(self.n * other, self.d).reduced()
            raise TypeError("Cannot multiply by that type")

        def __div__(self, other):
            if isinstance(other, Fraction):
                return Fraction(self.n * other.d, self.d * other.n).reduced()
            return NotImplemented

        def __rdiv__(self, other):
            # Handles int / Fraction
            if isinstance(other, int):
                return Fraction(self.d * other, self.n).reduced()
            return NotImplemented

        def __cmp__(self, other):
            # Compare by cross-multiplying; assumes positive denominators
            if isinstance(other, Fraction):
                return cmp(self.n * other.d, self.d * other.n)
            if isinstance(other, int):
                return cmp(self.n, self.d * other)
            raise TypeError("Cannot compare Fraction to that type")

        def ipart(self):
            # Integer part (floor division)
            return self.n // self.d

        def reduced(self):
            # Reduce to lowest terms in place and return self
            n, d = self.n, self.d
            f = gcd(n, d)
            self.n, self.d = n // f, d // f
            return self

        def reduce(self):
            # reduced() already modifies self in place
            self.reduced()

        def __str__(self):
            return "Fraction(%d / %d) = %s" % (
                self.n, self.d,
                str(decimal.Decimal(self.n) / decimal.Decimal(self.d)))
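To watch the special names being picked up by the + operator, here is a much smaller sketch than the class above. The class Frac and the helper gcd are written just for this example (the helper avoids any import), and positive denominators are assumed.

```python
def gcd(a, b):
    # Euclid's algorithm; a small helper so the sketch needs no imports
    while b:
        a, b = b, a % b
    return a

class Frac:
    def __init__(self, num, denom):
        f = gcd(num, denom)
        self.n, self.d = num // f, denom // f   # store in lowest terms

    def __add__(self, other):
        if isinstance(other, int):
            other = Frac(other, 1)
        if isinstance(other, Frac):
            return Frac(self.n * other.d + other.n * self.d, self.d * other.d)
        return NotImplemented

    __radd__ = __add__   # used when the Frac is on the right of +

    def __eq__(self, other):
        return isinstance(other, Frac) and (self.n, self.d) == (other.n, other.d)

    def __repr__(self):
        return "Frac(%d, %d)" % (self.n, self.d)

a = Frac(1, 2) + Frac(1, 3)   # calls Frac.__add__
b = 1 + Frac(1, 2)            # int can't add a Frac, so Frac.__radd__ is tried
```

Here a is Frac(5, 6) and b is Frac(3, 2). Returning NotImplemented instead of raising lets Python try the other operand's method before it gives up with a TypeError.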
There's a grammar to regular expressions, and different characters mean different things in different places. Keep this in mind.
Python's regular expressions (regex after this) are kept in the 're' module:
import re
Python's string handling can get in your way a bit when defining a regex pattern. For example, the backslash character '\' acts as a quoting character both in regex and in strings. So suppose you wanted a regex which looks for a single backslash. Because \ is a special character in strings, you have to escape it:
    print '\\'
    # \

So let's try to match this pattern against something:
    re.search('\\', 'word\\anotherword')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.6/re.py", line 142, in search
        return _compile(pattern, flags).search(string)
      File "/usr/lib/python2.6/re.py", line 245, in _compile
        raise error, v # invalid expression
    sre_constants.error: bogus escape (end of line)

What happened? re thinks a backslash is an escape character, too, so we passed it an escape character with nothing after it, which is bogus. We'd need to escape the backslashes twice to avoid this:
    re.search('(\\\\)', 'word\\anotherword').groups()
    # ('\\',)
So to prevent us from writing a zillion backslashes, we can tell Python to use 'raw strings', which don't treat \ like a special character. This happens when you put the letter r in front of a string constant:
    re.search(r'(\\)', 'word\\anotherword').groups()
    # here    ^
    # ('\\',)

This makes regexes a lot easier on the eyes; don't forget it.
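A quick check that the raw and the doubly escaped spellings really are the same pattern. The variable names below are chosen for this sketch only.

```python
import re

plain = '\\\\'   # four characters typed, two backslashes stored
raw = r'\\'      # two characters typed, two backslashes stored

same_pattern = (plain == raw)
pattern_length = len(raw)

# The regex engine reads those two backslashes as "one literal backslash"
match = re.search(raw, 'word\\anotherword')
position = match.start()   # index of the backslash in the subject string
found = match.group()
```

same_pattern is True, pattern_length is 2, and the match is found at index 4, just after "word".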
Regex has lots of metacharacters, which prove very useful.
. | Dot means any character (except a newline, unless you specify otherwise; see the documentation). |
^ | Caret means the start of a string. |
$ | Dollar means the end of a string. |
* | Star means repeat the previous thing zero or more times. For example, r'Go*gle' matches 'Ggle', 'Gogle', 'Google', and 'Goooooooooooogle'. |
+ | Plus means repeat the previous thing one or more times. For example, r'Go+gle' matches 'Gogle', 'Google', and 'Goooooooooooogle', but not 'Ggle'. |
() | Parens take multiple things and build a larger unit out of them. For example, r'ab*' matches 'abbbbbbb' but not 'ababab', but r'(ab)*' matches 'ababab' but not 'abbbbbb' or 'aaaaab'. |
{m} | Curly braces with one number in them require exactly that many copies of the previous thing. For example, r'Go{2}gle' matches only 'Google'. |
[] | Square brackets build character classes; any character in the class matches. For example, if you wanted to match the words "pool", "poel", "peol", and "peel", you could write r'p[oe]{2}l'. (Why does this match 'peol' and 'poel'?) You can also invert character classes by putting '^' at the beginning, so r'[^aeiou]' matches any character but a lowercase vowel. There are some predefined character classes, like '\s' (whitespace, including tabs, spaces, and newlines) and '\d' (digits, 0 through 9). |
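The examples in the table can be checked directly. In the sketch below, the helper matches() is an assumption of this example: it anchors the pattern at both ends so that only whole-string matches count, which is how the table's examples are meant to be read.

```python
import re

def matches(pattern, text):
    # True if the pattern matches the whole of text (start to end)
    return re.match(r'(?:%s)$' % pattern, text) is not None

star = [matches(r'Go*gle', w) for w in ('Ggle', 'Gogle', 'Google')]
plus = [matches(r'Go+gle', w) for w in ('Ggle', 'Gogle', 'Google')]
group = [matches(r'(ab)*', w) for w in ('ababab', 'abbbbbb')]
braces = [matches(r'Go{2}gle', w) for w in ('Gogle', 'Google', 'Gooogle')]
klass = [matches(r'p[oe]{2}l', w) for w in ('pool', 'poel', 'peol', 'peel', 'pal')]
```

Without the anchoring, re.search would happily find 'ab' inside 'abbbbbb', so the distinction between matching somewhere and matching everything is worth keeping in mind.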
What does this regex match?
    r'^\s*[^#].*$'

Let's walk through it.
Can someone give a more concise description of what this does?
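A way to test a proposed description is to run the pattern over a few sample lines. The sample strings below are invented for this sketch.

```python
import re

pattern = re.compile(r'^\s*[^#].*$')

samples = [
    'x = 1',           # ordinary code
    '    y = 2',       # indented code
    '# a comment',     # comment starting in the first column
    '',                # empty line
    '    # indented',  # indented comment
]
results = [pattern.match(line) is not None for line in samples]
```

The results are True, True, False, False, True. The last one may be surprising: \s* is allowed to give back a space, which [^#] then matches, so an indented comment still matches the pattern.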