Saturday, November 15, 2008

Repeated Squaring

http://www.algorithmist.com/index.php/Repeated_Squaring

Repeated squaring, or repeated doubling is an algorithm that computes integer powers of a number quickly. The general problem is to compute xy for an arbitrary integer y.

 

// Calculates n to the p power, where p is a positive number.
func power( var n as integer, var p as integer )
if p = 0 return 1
if p = 1 return n
if p is odd
return n * power( n * n, (p-1) / 2 )
else
return power( n * n, p / 2 )
end func

Tuesday, November 11, 2008

quadtree

http://en.wikipedia.org/wiki/Quadtree

UVa problem: 297: Quadtrees

refer to forum: http://acm.uva.es/board/viewtopic.php?f=2&t=700&p=2691&hilit=297&sid=156699fafa75aaeacf281e1af3a8006c#p2691

#include <iostream>

#define M 32
int p[M];
int idx;

int mask[] = {
0, 0x1, 0x3, 0, // 0-3
0xf, 0, 0, 0, 0xff, //4-8
0,0,0,0,0,0,0,0xffff, // 16
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0xffffffff // 32
};

#ifndef ONLINE_JUDGE
// couting 1's
int count_bit(int n) {
int bits = 0;
while(n) {
n &= n-1;
++bits;
}
return bits;
}
#endif

void color (char *str, int size, int x, int y) {
if (str[idx] == 'p') {
idx ++;
int sz = size>>1;
color(str, sz, x+sz, y);
color(str, sz, x, y);
color(str, sz, x, y+sz);
color(str, sz, x+sz, y+sz);
}
else if (str[idx] == 'f') {
idx ++;
for (int i = y; i < y + size; i ++)
p[i] |= mask[size] << x;
}
else idx++;
}
int main () {
int num;
char tree1[1500], tree2[1500];

scanf("%d\n", &num);
while(num--) {
scanf("%s\n%s\n", tree1, tree2);

memset(p, 0, sizeof(p));

idx=0;
color(tree1, M, 0, 0);

idx=0;
color(tree2, M, 0, 0);

int total = 0;
for (int i = 0; i < M; i ++)
#ifndef ONLINE_JUDGE
total += count_bit(p[i]);
#else
total += __builtin_popcount(p[i]);
#endif

printf("There are %d black pixels.\n", total);
}
return 0;
}
UVa problem: 752 - Unscrambling Images
#include <iostream>
#include <deque>
using namespace std;
int q[16][16];
int e[16][16]; // encrypted

int d[16][16]; // decoded
inline int getxy(int size, int k, int &x, int &y) {

deque<int> Q;
while(k > 0) {
int r = k % 4;
if (r == 0) {
r = 4;
k = (k-4) >> 2;
} else
k = k >>2;

Q.push_front(r);
}

x = 0, y = 0;
for(int i = 0; i < Q.size(); ++i) {
size = size >> 1;
int idx = Q[i];

if (idx == 2)
y += size;
if (idx == 3)
x += size;
if (idx == 4) {
x += size;
y += size;
}
}
return size;
}

int main() {
int ndata;
scanf("%d ", &ndata);
int nd = 1;
while(nd <= ndata) {
if(nd > 1) printf("\n");
printf("Case %d\n\n", nd++);
int n;
scanf("%d ", &n);

// quadtree reset to all 0
memset(q, 0, sizeof(q));

int m; // number of leaf nodes

scanf("%d ", &m);
int k;
int intensity;
for(int i = 0; i < m; ++ i) {
scanf("%d %d ", &k, &intensity);
int x, y;
int size = getxy(n, k, x, y);

for(int i = x; i < x+size; ++i)
for(int j = y;j < y+size; ++j)
q[i][j] = intensity;
}
/*
for(int i = 0; i < n; ++i) {
for(int j = 0; j < n; ++j)
printf("%4d", q[i][j]);
printf("\n");
}
printf("\n");
*/

scanf("%d ", &m);

// encrypted quadtree reset to all 0
memset(e, 0, sizeof(e));
for(int i = 0; i < m; ++ i) {
scanf("%d %d ", &k, &intensity);
int x, y;
int size = getxy(n, k, x, y);

for(int i = x; i < x+size; ++i)
for(int j = y;j < y+size; ++j)
e[i][j] = intensity;
}

for(int i = 0; i < n; ++i) {
for(int j = 0; j < n; ++j) {
int x = q[i][j] / n;
int y = q[i][j] % n;
d[x][y] = e[i][j];
}
}
for(int i = 0; i < n; ++i) {
for(int j = 0; j < n; ++j)
printf("%4d", d[i][j]);
printf("\n");
}
}
}

Wednesday, November 05, 2008

Add without +

Write a program to add two numbers a and b.
1. Without Using + operator
2. Without using any loops

Sum = A XOR B
Carry = A AND B

#include <iostream>

int add(int p, int q)
{
if(q == 0)
return p;
int k, r;
k = p ^ q;
r = p & q;
return add(k, r<<1); // carry left shift 1 bit
}

int main() {
printf("%d\n", add(4,7));
printf("%d\n", add(401,7));
printf("%d\n", add(14,7));
printf("%d\n", add(4,47));
printf("%d\n", add(4,-97));
printf("%d\n", add(-4,-97));
}


output:



11
408
21
51
-93
-101

C++ notes II

Segmentation fault:  A segmentation fault occurs when a program attempts to access a memory location that it is not allowed to access, or attempts to access a memory location in a way that is not allowed. Segmentation is one approach to memory management and protection in the operating system.

Stack Overflow: Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than there was actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly.

A core dump is the recorded state of the working memory of a computer program at a specific time, generally when the program has terminated abnormally (crashed). In practice, other key pieces of program state are usually dumped at the same time, including the processor registers, which may include the program counter and stack pointer, memory management information, and other processor and operating system flags and information.

Static function:

The differences between a static member function and non-static member functions are as follows.

  • A static member function can access only static member data, static member functions and data and functions outside the class. A non-static member function can access all of the above including the static data member.
  • A static member function can be called, even when a class is not instantiated, a non-static member function can be called only after instantiating the class as an object.
  • A static member function cannot be declared virtual, whereas a non-static member functions can be declared as virtual
  • A static member function cannot have access to the 'this' pointer of the class.

The static member functions are not used very frequently in programs. But nevertheless, they become useful whenever we need to have functions which are accessible even when the class is not instantiated.

A memory leak is a particular type of unintentional memory consumption by a computer program where the program fails to release memory when no longer needed. This condition is normally the result of a bug in a program that prevents it from freeing up memory that it no longer needs.

http://linux.die.net/man/2/fork

fork() creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pending signals are not inherited.

Under Linux, fork() is implemented using copy-on-write pages, so the only penalty that it incurs is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child.

Return Value

On success, the PID of the child process is returned in the parent's thread of execution, and a 0 is returned in the child's thread of execution. On failure, a -1 will be returned in the parent's context, no child process will be created, and errno will be set appropriately.

Shallow copies and deep copies

A shallow copy of an object copies all of the member field values. This works well if the fields are values, but may not be what you want for fields that point to dynamically allocated memory. The pointer will be copied. but the memory it points to will not be copied -- the field in both the original object and the copy will then point to the same dynamically allocated memory, which is not usually what you want. The default copy constructor and assignment operator make shallow copies.

A deep copy copies all fields, and makes copies of dynamically allocated memory pointed to by the fields. To make a deep copy, you must write a copy constructor and overload the assignment operator, otherwise the copy will point to the original, with disasterous consequences.

Deep copies need ...

If an object has pointers to dynamically allocated memory, and the dynamically allocated memory needs to be copied when the original object is copied, then a deep copy is required.

A class that requires deep copies generally needs:

Defunct Process:  Defunct processes are corrupted processes that can no longer communicate between the parent and child one. Sometimes, they become “zombies” and remain in your system until you reboot your machine. You can try to apply “kill -9″ command, but most of the time you’ll be out of luck.

Stack unwinding:

When an exception is thrown and control passes from a try block to a handler, the C++ run time calls destructors for all automatic objects constructed since the beginning of the try block. This process is called stack unwinding. The automatic objects are destroyed in reverse order of their construction. (Automatic objects are local objects that have been declared auto or register, or not declared static or extern. An automatic object x is deleted whenever the program exits the block in which x is declared.)

If an exception is thrown during construction of an object consisting of subobjects or array elements, destructors are only called for those subobjects or array elements successfully constructed before the exception was thrown. A destructor for a local static object will only be called if the object was successfully constructed.

http://en.wikipedia.org/wiki/Data_segment

In the PC architecture there are four basic read-write memory regions in a program: Stack, Data, BSS, and Heap.

  1. The Data segment contains constants used by the program that are not initialized to zero. For instance the string defined by char s[] = "hello world"; in C would exist in the data part.
  2. The BSS segment starts at the end of the data segment and contains all global variables that are initialized to zero. For instance a variable declared static int i; would be contained in the BSS segment.
  3. The heap area begins at the end of the data segment and grows to larger addresses from there. The Heap area is managed by malloc, realloc, and free, which use the brk and sbrk system calls to adjust its size. The Heap area is shared by all shared libraries and dynamic load modules in a process.
  4. The stack is a LIFO structure, typically located in the higher parts of memory. It usually "grows down" with every register, immediate value or stack frame being added to it. A stack frame consists at minimum of a return address.

Virtual Table The virtual table is a lookup table of functions used to resolve function calls in a dynamic/late binding manner.

A Simple Garbage Collector for C++  http://www.devarticles.com/c/a/Cplusplus/A-Simple-Garbage-Collector-for-C-plus-plus/

A deadlock is a situation wherein two or more competing actions are waiting for the other to finish, and thus neither ever does.

In scheduling, priority inversion is the scenario where a low priority task holds a shared resource that is required by a high priority task. This causes the execution of the high priority task to be blocked until the low priority task has released the resource, effectively "inverting" the relative priorities of the two tasks. If some other medium priority task, one that does not depend on the shared resource, attempts to run in the interim, it will take precedence over both the low priority task and the high priority task.

3Sum problem

http://discuss.techinterview.org/default.asp?interview.11.613896.3

http://en.wikipedia.org/wiki/3SUM

Given an unsorted array of integers. Find any three numbers satisfying following condition:
c = a+b

First, sort the array in O(nlogn).
For each element c of the array try to find two numbers a and b such that a+b = c.
Time complexity:
O(n) to find a+b = c in sorted array.  (We can use two pointers and move them towards the middle)

Since we do this for each element. O(n^2).

Monday, November 03, 2008

Maximun Flow

Residual network: (credit: http://www.topcoder.com/tc?module=Static&d1=tutorials&d2=maxFlow)

  • if the flow along the edge x-y is less than the capacity there is a forward edge x-y with a capacity equal to the difference between the capacity and the flow (this is called the residual capacity),
  • if the flow is positive there is a backward edge y-x with a capacity equal to the flow on x-y.

image image

Augmenting path: a path from the source to the sink in the residual network, whose purpose is to increase the flow in the original one.

Ford-Fulkerson method: start with no flow everywhere and increase the total flow in the network while there is an augmenting path from the source to the sink with no full forward edges or empty backward edges - a path in the residual network.

image

A cut in a flow network is simply a partition of the vertices in two sets, let's call them A and B, in such a way that the source vertex is in A and the sink is in B.

The capacity of a cut is the sum of the capacities of the edges that go from a vertex in A to a vertex in B.

The flow of the cut is the difference of the flows that go from A to B (the sum of the flows along the edges that have the starting point in A and the ending point in B), respectively from B to A, which is exactly the value of the flow in the network, due to the entering flow equals leaving flow - property, which is true for every vertex other than the source and the sink.

The flow of the cut is less or equal to the capacity of the cut due to the constraint of the flow being less or equal to the capacity of every edge.

This implies that the maximum flow is less or equal to every cut of the network. This is where the max-flow min-cut theorem comes in and states that the value of the maximum flow through the network is exactly the value of the minimum cut of the network.

Maximum Bipartite Matching

http://shygypsy.com/tools/bpm.cpp   (a bpm code)

UVa Problems:

  • 10080: Gopher (II)  (跟670几乎一样,先确定graph[][]即可,用scanf("%lf", ..) 和eps比较double型)
  • 10092: The Problem with the Problem Setter  (L: the number of problems that should be included; R: the total number of problems in the pool)
  • 259:   Software Allocation (每台机子只能run一个application, 又是典型的0-1 assignment)
  • 670:   The dog task

其实就是先填一个graph[][]即可,在每两个master点之间,确定dog能到达哪些interesting places. 然后就是一个0-1 assignment问题而已。另注意double型比较用了eps = 1e-18(开始用1e-15 WA: )

special case:

2 1
0 0 2 2
3 3

#include <iostream>
#include <vector>
#include <cmath>
using namespace std;

#define MAX 101
#define eps 1e-18
typedef pair<int,int> ii;

bool graph[MAX][MAX];
bool seen[MAX];
int matchL[MAX], matchR[MAX];
// n: dim of L; m: dim of R
int n, m;

bool bpm( int u ) {
for( int v = 0; v < m; v++ ) if( graph[u][v] )
{
if( seen[v] ) continue;
seen[v] = true;

if( matchR[v] < 0 || bpm( matchR[v] ) )
{
matchL[u] = v;
matchR[v] = u;
return true;
}
}
return false;
}

double dist(ii a, ii b) {
return sqrt((a.first - b.first) * (a.first - b.first) +
(a.second - b.second) * (a.second - b.second) + 0.0);
}

int main() {
int ndata;
scanf("%d ", &ndata);
while(ndata--) {
scanf("%d %d ", &n, &m);

vector<ii> master(n);
vector<ii> place(m);
for(int i = 0; i < n; ++i)
scanf("%d %d ", &master[i].first, &master[i].second);

for(int i = 0; i < m; ++i)
scanf("%d %d ", &place[i].first, &place[i].second);

memset(graph, false, sizeof(graph));
memset( matchL, -1, sizeof( matchL ) );
memset( matchR, -1, sizeof( matchR ) );

for(int i = 0; i < n-1; ++i) {

// master's walking distance doubled
double d2 = dist(master[i], master[i+1]) * 2;
for(int j = 0; j < m; ++j) {
// dog's walking distance
double dogdist = dist(master[i], place[j]) + dist(place[j], master[i+1]);
// as long as dog's walking distance is no more than 2xmaster's, the dog can visit this place
if(d2 - dogdist>= eps)
graph[i][j] = true;
}
}

int cnt = 0;
for( int i = 0; i < n; ++ i){
memset(seen, 0, sizeof( seen ) );
if(bpm(i)) cnt++;
}

printf("%d\n", n+cnt);
printf("%d %d", master[0].first, master[0].second);
for(int i = 0; i < n; ++i) {
if(i) printf(" %d %d", master[i].first, master[i].second);
if(matchL[i] != -1) {
int p = matchL[i];
printf(" %d %d", place[p].first, place[p].second);
}
}
printf("\n");
if(ndata) printf("\n");
}
}


这题难点在于要输出unique的matching, 我是run max bipartite matching 算法 |E|次(the number of edges), 每次检查uniqueness


#include <iostream>
#include <vector>
using namespace std;

struct Rect {
int xmin;
int xmax;
int ymin;
int ymax;
};

#define MAX 27 // the number of alphabet
bool graph[MAX][MAX];
bool seen[MAX];
int matchL[MAX], matchR[MAX];
int n;
Rect slides[MAX];

bool exist; // if solution already exists
int savedR[MAX]; // save previous solution

// if a points is inside a rectangle
inline bool ptInRect(int x, int y, Rect r) {
//No number will lie on a slide boundary.
if (x > r.xmin && x < r.xmax && y > r.ymin && y < r.ymax) return true;
return false;
}

bool bpm( int u ) {
for( int v = 0; v < n; v++ ) if( graph[u][v] )
{
if( seen[v] ) continue;
seen[v] = true;

if( matchR[v] < 0 || bpm( matchR[v] ) )
{
matchL[u] = v;
matchR[v] = u;
return true;
}
}
return false;
}

int main() {
int ndata = 0;
while(scanf("%d ", &n) && n) {
printf("Heap %d\n", ++ndata);

for(int i = 0; i < n; ++i) {
scanf("%d %d %d %d ", &slides[i].xmin,
&slides[i].xmax,
&slides[i].ymin,
&slides[i].ymax);
}

memset(graph, false, sizeof(graph));
memset(savedR, -1, sizeof(savedR));
exist = false;

int x, y;
for(int i = 0; i < n; ++i) {
scanf("%d %d ", &x, &y);
for(int j = 0; j < n; ++j)
if(ptInRect(x, y, slides[j]))
graph[i][j] = true;
}

int cnt = 0;
for(int u = 0; u < n; ++u) {
for(int v = 0; v < n; ++v) {
// each time, put one edge into the match
// and try to construct the max bipartite match based on that
if(!graph[u][v])
continue;

memset( matchL, -1, sizeof( matchL ) );
memset( matchR, -1, sizeof( matchR ) );

matchL[u] = v;
matchR[v] = u;

int cnt = 1;
for( int i = 0; i < n; ++ i){
if (i == u) continue;
memset(seen, 0, sizeof( seen ));
if(bpm(i)) cnt++;
}

if(exist) {
for(int k = 0; k < n; ++k)
//check uniqueness
if (matchR[k] != savedR[k])
savedR[k] = -1; // not unique, reset to -1
} else {
memcpy(savedR, matchR, sizeof(int)*n); // save the first solution
exist = true;
}
}
}

int cardinal = 0;
int i;
for(i = 0; i < n; ++i) {
if(savedR[i] != -1) {
++cardinal;
printf("(%c,%d)", i + 'A', savedR[i] + 1);
break;
}
}

for(++i; i < n; ++i) {
if(savedR[i] != -1) {
++cardinal;
printf(" (%c,%d)", i + 'A', savedR[i] + 1);
}
}

if(cardinal == 0)
printf("none");

printf("\n\n");
}
}

Hungarian Method

Easy to understand examples:

http://www.ee.oulu.fi/~mpa/matreng/eem1_2-1.htm

http://www.ams.jhu.edu/~castello/362/Handouts/hungarian.pdf

http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html

animation: http://www.ifors.ms.unimelb.edu.au/tutorial/hungarian/index.html

topcoder tutorial: Assignment Problem and Hungarian Algorithm (very good regarding theory and c++ code)

http://www.math.uwo.ca/~mdawes/courses/344/kuhn-munkres.pdf

Sunday, November 02, 2008

String Searching

http://allisons.org/ll/AlgDS/Strings/

http://en.wikipedia.org/wiki/String_searching_algorithm

http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm

Suffix Tree

Tries and suffix tree

wiki - Suffix Trees

http://allisons.org/ll/AlgDS/Tree/Suffix/

image

 

image

The suffix tree can be built in O(n) time (linear time regarding the length of strings) due to Ukkonen (1995).

Applications: [1,2,3] from http://allisons.org/ll/AlgDS/Tree/Suffix/

  1. String Search: Searching for a substring, pat[1..m], in txt[1..n], can be solved in O(m) time (after the suffix tree for txt has been built in O(n) time).
  2. Palindromes: The longest palindrome of txt[1..n] can be found in O(n) time, e.g. by building the suffix tree for txt$reverse(txt)# or by building the generalized suffix tree for txt and reverse(txt).
  3. Longest Common String:  The longest common substring of two strings, txt1 and txt2, can be found by building a generalized suffix tree for txt1 and txt2: Each node is marked to indicate if it represents a suffix of txt1 or txt2 or both. The deepest node marked for both txt1 and txt2 represents the longest common substring.  Equivalently, one can build a (basic) suffix tree for the string txt1$txt2#, where `$' is a special terminator for txt1 and `#' is a special terminator for txt2. The longest common substring is indicated by the deepest fork node that has both `...$...' and `...#...' (no $) beneath it.