Garbage Collection Ch.3-b

Overview
3.4 Hardware reference counting
3.5 Cyclic Reference Counting
3.6 Issues to consider
3.7 Notes
Summary

Overview

Ch 3.4: Hardware Reference Counting
Discuss the merits of hardware supports for reference counting.

Ch 3.5: Cyclic Reference Counting
Discuss how to reclaim cyclic data structures.

Ch 3.6: Issues to consider
Discuss why reference counting is chosen.

3.4 Hardware reference counting

Goal: Reduction of overhead by reference counting
Means: Using two memory
- Main Memory
- Reference counting memory (RCM)

ex) Update(R, S)

The CPU executes *R = S;
Memory is updated.
RCM manages reference counts.

Merits:
- The CPU doesn't have to manage reference counts.
- No synchronization required.
  - Synchronization between client programs and tracing garbage collectors
  - Locks on reference counts
But hardware reference counting has not been realized...
- Because of high development costs
- And mark-sweep is also required to reclaim cyclic garbages.

3.5 Cyclic Reference Counting

Reference counting cannot reclaim cyclic data structures
- Cyclic data structures are common
- ex) back-pointer, recursion in functional programming languages
Approach:
- Manpower
  - Burdensome, unsafe
- Hybrid memory manager
  - Reference counting + mark-sweep
- Utilization of domain features
  - Functional programming languages
  - Certain programming idioms

※ None of the approaches have been adopted for use by significant systems.

3.5.1 Functional Programming Language

Functional programming languages have recursions.
```
fun loop() = loop()
```
Features of cyclic data structures in functional programming languages:
- The circular structure is created all at once;
- Any use of a subset of the cycle without root is copied;
- Cycle-closing pointers to the head are tagged as such;
These features ensure that:
- The cycle is treated as a single entity.
- If the head is deleted, the entire cycle can be reclaimed.

3.5.2 Bobrow's Technique

General Idea

Distinguish internal references in cycle from external references.

Internal references are not be counted.
External references are counted as references to the cycle as a whole.

Bobrow's Idea

Programmer assigns all cells to a group.

Each group is reference counted.
Each group is reclaimed as a whole.

Algorithm

Reference count manipulations by Update.

Update(R: ptr, S: obj) =
  T = *R
  gr = group_no(R)
  if gr != group_no(S)    // R and S are in different groups
    increment_groupRC(S)
  if gr != group_no(T)    // R and T are in different groups
    decrement_groupRC(T)
    if groupRC(T) == 0
      reclaim_group(T)
  *R = S

Question

Q1: Update(A, B)

RC(group 1) = 1, RC(group 2) = 0

Q2: Update(B, C)

RC(group 1) = 1, RC(group 2) = 2

Drawbacks

Cannot reclaim sub-groups
Cannot reclaim inter-group cycles

Best practice

Each group comprise a single strongly connected component.

Minimal size of groups
No inter-group cycles

Strongly Connected Component (SCC)

Definition: Set of nodes each of which is reachable from every other node.

ex)

graph LR subgraph scc0 a --> b --> e --> a end subgraph scc1 f --> g --> f end subgraph scc2 c --> d --> h h --> d --> c end b --> c c --> g e --> f

3.5.3 Weak-pointer Algorithms

General Idea

Distinguish closing pointer(weak pointer) from other references.

Two invariants:
- Active nodes are reachable from root via a chain of strong pointers;
- Strong pointers do not form cycles;

Brownbridge's Idea

Each cell has two reference counters.

Strong reference counters (SRC)
Weak reference counters (WRC)

Algorithms

New

Allocate a new cell and return a strong pointer.

Because allocating new cells cannot create cycles.

New() = 
  newcell = allocate()
  SRC(R) = 1
  return strong(newcell)    // return strong pointer

Update

All copies of pointers are weak.

Because copying pointers may create cycles.
- Conservative approach
- This may break the following invariant:
  - Active nodes are reachable from root via a chain of strong pointers;
- delete corrects pointer strengths (See below)

Update(R: ptr, S: obj) = 
  WRC(S) = WRC(S) + 1
  delete(*R)
  *R = S
  weaken(*R)  // weaken pointer R

Delete

Decrement reference counts according to the following rules:

Delete weak reference: decrement weak reference counts
Delete strong reference
- If SRC == 0 and WRC == 0: delete cells
- If SRC == 0 and WRC > 0: correct pointer type

delete(T) = 
  if is_weak(T)                         // (i)
    WRC(T) -= 1
  else
    SRC(T) -= 1
    if SRC(T) == 0 and WRC(T) == 0      // (ii)
      for U in Children(T)
        delete(*U)
      free(T)
    else if SRC(T) == 0 and WRC(T) > 0  // (iii)
      invert_strength(T)
      for U in Children(T)
        suicide(T, *U)
      if SRC(T) == 0
        for U in Children(T)
          delete(*U)
        free(T)

Rule (i): Delete weak reference

Do not delete the cell because of the invariant "Active nodes are pointed by strong pointers".

WRC(T) -= 1

Rule (ii): Delete cells

SRC(T) -= 1
if SRC(T) == 0 and WRC(T) == 0
  for U in Children(T)
    delete(*U)
  free(T)

Rule (iii): Correct pointer type

SRC(T) -= 1
if SRC(T) == 0 and WRC(T) > 0
  // T is active
  // or cyclic garbage
  invert_strength(T)
  for U in Children(T)
    suicide(T, *U)
  if SRC(T) == 0
    for U in Children(T)
      delete(*U)
    free(T)

// Adjust pointer's strength
suicide(Start_node, S) = 
  if is_strong(S)
    if S == Start_node
      weaken(S)
    else if SRC(S) > 1
      // loop header candidate
      weaken(S)
    else for T in Children(S)
      suicide(Start_node, *T)

Question

Answer pointer's strengths after suicide(T, S)

Failure Example of Brownbridge's delete

delete discards cyclic structures incorrectly in the following example.

Because this algorithm does not take weak external references into consideration.
SRC(S) > 1

Solution: Salkild's suicide

Salkild's suicide

suicide should restart if it discovers may-active-loop nodes.

If suicide discover a cell with weak pointers but only one strong pointer;
- Flip all pointers
- Restart suicide from the cell

Failure Example

Salkild's suicide fails to terminate in the following example.

Solution: Marking scheme
Drawbacks: Poor efficiency, High space overhead(2 ref counts, mark-flag)

Weak pointer implementation

Strength-bit represents strong and weak pointer.

Each pointer and object have an strength-bit.
If pointer's bit == object's bit, the pointer is strong.
If pointer's bit != object's bit, the pointer is weak.

Strengthening or Weakening is done by flip pointer's strength bit.
Strengthening all weak pointers is done by flip object's strength bit.

3.5.4 Partial Mark-Sweep Algorithms

General Idea

Hybrid method of reference counting and mark-sweep.

Mark-sweep reclaims cycles.
Reference counting reclaims other cells.

How-to reclaim cycles (Trial Deletion):

Find cells that may be cycle garbage
Traverse these cells and decrement the reference counts
If the reference counts are zero, these cells must be cycle garbage!!

Lins's Algorithm

Hybrid method
- Reference counting reclaims uniquely referenced cells.
- Mark-sweep reclaims shared cells.
  - Shared cells include cycles.
Reclaim cycles lazily
- Same as mark-sweep
Traverse only cyclic garbage
- Mark-sweep traverses all active cells

Colors (States)

Lins's algorithm colors cells with 4 color.

Black: active, liveness
Grey: marked (in mark-sweep)
White: garbages
Purple: may-be-members of cyclic garbages
- Mark-sweep collector needs to traverse them.
- Control set holds purple cells.

Deletion

Decrement reference counts and Paint cells:

If the reference count is zero
- Free cells
If the reference count is not zero (shared)
- Paint purple
- Add the cell into control set
  - Control set holds purple cells.
  - Reclaimed by mark-sweep

delete(T) =
  RC(T) -= 1
  if RC(T) == 0
    color(T) = black          // for reuse (?)
    for U in Children(T)
      delete(*U)
    free(T)
  else if color(T) != purple  // shared
    if control_set is full
      gc_control_set()        // mark-sweep
    color(T) = purple
    push(T, control_set)

New

Allocate new cells painted with black (active).

New() = 
  newcell = allocate()
  RC(newcell) = 1
  color(newcell) = black
  return newcell

Update

Both arguments must be black (active).

If these cells are placed in control set, remove them from it.
- These cells are not garbages.

Update(R: ptr, S: obj) = 
  RC(S) += 1
  color(R) = black
  color(S) = black
  delete(*R)
  *R = S

Three-phase Mark-Sweep

Mark-sweep reclaims garbage from control set:

Get a purple cell from control set
Mark: remove internal reference counts in subgraph
- Garbages have no external references.
Scan: determine whether cells are active or garbages
Sweep: reclaim garbages

Algorithm:

gc_control_set() =
  S = pop(control_set)
  if color(S) == purple
    mark_gray(S)        // (i)
    scan(S)             // (ii)
    collect_white(S)    // (iii)
  else if control_set != empty
    gc_control_set()

Step-1: Mark (remove internal reference counts)

mark_grey(S) = 
  if color(S) != grey
    color = grey
    for T in Children(S)
      RC(*T) -= 1
      mark_grey(*T)

Step-2: Scan (determine whether cells are active or garbages)

scan(S) = 
  if color(S) == grey
    if RC(S) > 0
      scan_black(S)
    else  // RC(S) == 0
      // garbage
      color(S) = white
      for T in Children(S)
        scan(*T)

// Activate S and Children(S)
scan_black(S) = 
  color(S) = black
  for T in Children(S)
    RC(*T) += 1
    if color(*T) != black
      scan_black(*T)

Step-3: Sweep

collect_white(S) = 
  if color(S) == white
    color(S) = black
    for T in Children(S)
      collect_white(*T)
    free(S)

Question

mark_grey(A)

scan(A)

collect_white(A)

Control set strategies

Timing of mark-sweep
- When control set is full (Lins)
- When free-list is empty (same as mark-sweep)
- When the size of free-list drops below a certain size
- Or some heuristics
Data structure of control set
- Stack
- Queue
- Heap-allocated list
- Bitmap
Lins's approach works best when side-effects are rare.
- Many nodes are uniquely referenced.
  - Reference count reclaims them.
- Sub-graphs (cycles) are small.
  - Mark-sweep finishes quickly.

Christopher's Algorithm

Christopher's algorithm is a special case of Lin's algorithm.

Control set is the entire heap
- Perform mark-sweep for the entire heap
- Compared with Lins's, the following are not required:
  - Control set
  - Coloring
Features
- Automatic memory management without any support from compilers
  - Because this algorithm doesn't use coloring.
- No extra space required
  - Because this algorithm doesn't manage control set
- Linear sweeping in the entire heap
  - Linear sweeping is faster than tracing graph.
  - Whether Christopher's or Lins's is better depends on:
    - Data structures in question
    - Virtual memory management
    - Heap size
    - etc
Three-phase Mark-sweep (Almost same as Lins's)
1. Mark: remove internal reference counts
2. Scan: determine active cells using external reference counts
3. Sweep: reclaim any cells whose reference count is zero

3.6 Issues to consider

Reference counting have four weak points:

The delay to free garbages recursively
The high overhead imposed on pointer operations
The space required for the reference counts
The inability to reclaim garbage cycles

Nevertheless, many programmers often choose to use reference counting.
There may be four reasons for this choice.

Ease of implementation

Reference counting is easier to implement than tracing collectors because:

Pointer operations can be replaced by macros
- e.g., smart pointer in object-oriented languages
We don't need to know all the roots of computation
- The roots are important in tracing collectors

Control, optimization and correctness

Reference counting can provide programmers with total control.

Optimization
- Programmer can apply reference counting only to the objects which manual deallocation is difficult
Trade-off between performance and correctness
- Better performance if we optimize code
- Worse correctness if we optimize code
  - More difficult maintenance

Garbage collection delay

The overheads of reference count are distributed throughout the computation.

	reference count	tracing
Pointer operation	Slow	Fast
Pause	No	Yes
Immediate reuse optimization (e.g., destructive re-assignment)	Yes	No

Space overhead

Space for a reference count is required in each heap object's header.
The relative space advantage is application dependent:

Reference count
- High overhead for small cells
- Low overhead for large cells
Mark-sweep
- Some headroom for efficiency (to avoid thrashing)
Copying
- Double address space of mark-sweep

3.7 Notes

Reference count was originally developed for Lisp by Gorge Collins[1960]
Many studies
- Reduction of run-time cost
- Limited size reference counts
- Hardware reference count
- Reclaiming cycles
Studies in parallel systems
- Advantages
  - Synchronization between user program and garbage collection threads is not required.
- Disadvantages
  - Locks on each object's reference count are required.
Studies in distributed systems
- Advantages
  - Short pause time
- Difficulties
  - Ensuring that count manipulation messages arrive in the right order.

Summary

In this chapter, we discuss solutions to reclaim cyclic data structures in reference count:

Approaches for functional programming languages
Grouping by programmer
Weak pointers which represent cycle-closing pointers
Hybrid approaches of reference count and mark-sweep

However, none of the approaches have been adopted for use by significant systems.
The simplest approach (reference counting + mark-sweep) may be the best.

Contents

Overview

3.4 Hardware reference counting

3.5 Cyclic Reference Counting

3.5.1 Functional Programming Language

3.5.2 Bobrow's Technique

General Idea

Bobrow's Idea

Algorithm

Question

Drawbacks

Best practice

Strongly Connected Component (SCC)

3.5.3 Weak-pointer Algorithms

General Idea

Brownbridge's Idea

Algorithms

New

Update

Delete

Question

Failure Example of Brownbridge's delete

Salkild's suicide

Failure Example

Weak pointer implementation

3.5.4 Partial Mark-Sweep Algorithms

General Idea

Lins's Algorithm

Colors (States)

Deletion

New

Update

Three-phase Mark-Sweep

Question

Control set strategies

Christopher's Algorithm

3.6 Issues to consider

Ease of implementation

Control, optimization and correctness

Garbage collection delay

Space overhead

3.7 Notes

Summary