
This paper presents an analysis of the recent tropospheric molecular hydrogen (H2) budget with a particular focus on soil uptake and surface emissions. A variational inversion scheme is combined with observations from the RAMCES and EUROHYDROS atmospheric networks, which include continuous measurements performed between mid-2006 and mid-2009. Several scenarios were inverted: the net H2 surface flux; soil uptake separately from surface emissions; and finally soil uptake, biomass burning, anthropogenic emissions and N2 fixation-related emissions, each treated separately. The various inversions generate an estimate for each term of the H2 budget. The net H2 flux per region (High Northern Hemisphere, Tropics and High Southern Hemisphere) varies between −8 and 8 Tg yr−1. The best inversion in terms of fit to the observations combines updated prior surface emissions and a soil deposition velocity map that is based on soil uptake measurements. Our estimate of global H2 soil uptake is −59 ± 4.0 Tg yr−1. Forty per cent of this uptake is located in the High Northern Hemisphere and 55% is located in the Tropics. In terms of surface emissions, seasonality is mainly driven by biomass burning emissions. The inferred European anthropogenic emissions are consistent with independent H2 emissions estimated using an H2/CO mass ratio of 0.034 and CO emissions, considering their respective uncertainties. Constraining a more robust partition of H2 sources and sinks would require additional data, such as isotopic measurements.

This paper presents an analysis of the recent tropospheric molecular hydrogen (H2) budget with a particular focus on soil uptake and European surface emissions. A variational inversion scheme is combined with observations from the RAMCES and EUROHYDROS atmospheric networks, which include continuous measurements performed between mid-2006 and mid-2009. Net H2 surface flux, then deposition velocity and surface emissions, and finally deposition velocity, biomass burning, anthropogenic and N2 fixation-related emissions were simultaneously inverted in several scenarios. These scenarios focused on the sensitivity of the soil uptake value to different spatio-temporal distributions. The range of variation across these diverse inversion sets generates an estimate of the uncertainty for each term of the H2 budget. The net H2 flux per region (High Northern Hemisphere, Tropics and High Southern Hemisphere) varies between −8 and +8 Tg yr−1. The best inversion in terms of fit to the observations combines updated prior surface emissions and a soil deposition velocity map that is based on bottom-up and top-down estimations. Our estimate of global H2 soil uptake is −59 ± 9 Tg yr−1. Forty per cent of this uptake is located in the High Northern Hemisphere and 55% is located in the Tropics. In terms of surface emissions, seasonality is mainly driven by biomass burning emissions. The inferred European anthropogenic emissions are consistent with independent H2 emissions estimated using an H2/CO mass ratio of 0.034 and CO emissions, within the range of their respective uncertainties. Additional constraints, such as isotopic measurements, would be needed to infer a more robust partition of H2 sources and sinks.

We empirically investigate algorithms for solving Connected Components in the external memory model. In particular, we study whether the randomized O(Sort(E)) algorithm by Karger, Klein, and Tarjan can be implemented to compete with practically promising and simpler algorithms having only slightly worse theoretical cost, namely Borůvka’s algorithm and the algorithm by Sibeyn and collaborators. For all algorithms, we develop and test a number of tuning options. Our experiments are executed on a large set of different graph classes including random graphs, grids, geometric graphs, and hyperbolic graphs. Among our findings are: The Sibeyn algorithm is a very strong contender due to its simplicity and due to an added degree of freedom in its internal workings when used in the Connected Components setting. With the right tunings, the Karger-Klein-Tarjan algorithm can be implemented to be competitive in many cases. Higher graph density seems to benefit Karger-Klein-Tarjan relative to Sibeyn. Borůvka’s algorithm is not competitive with the two others.
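The contraction idea common to these algorithms can be sketched in a few lines. The toy below is an in-memory illustration of Borůvka-style hooking and contraction for connected components, not the tuned external-memory implementations the study evaluates; the hook-to-smaller-label rule is one standard way to keep the hooking acyclic.

```python
def connected_components(n, edges):
    """Component label per vertex via Boruvka-style hooking and contraction."""
    label = list(range(n))            # each vertex starts in its own component
    current = list(edges)
    while True:
        # Each component hooks onto its smallest neighbouring component,
        # restricted to smaller labels so the hook pointers form no cycles.
        hook = {}
        for u, v in current:
            a, b = label[u], label[v]
            if a == b:
                continue
            hi, lo = max(a, b), min(a, b)
            if lo < hook.get(hi, hi):
                hook[hi] = lo
        if not hook:
            break
        # Follow hook pointers to their roots and relabel all vertices.
        def root(c):
            while c in hook:
                c = hook[c]
            return c
        label = [root(c) for c in label]
        # Contract: keep only edges that still cross two components.
        current = [(u, v) for u, v in current if label[u] != label[v]]
    return label
```

Each round merges every component that has a neighbour, so the number of components with outgoing edges at least halves per round; the external-memory variants differ mainly in how the relabeling pass is realized with sorting and scanning.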

This thesis presents research spanning three conference papers and one manuscript that has not yet been submitted for peer review.
The topic of Paper 1 is the inherent complexity of maintaining perfect height in B-trees. We consider the setting in which a B-tree of optimal height contains n = (1−ϵ)N elements, where N is the number of elements in a full B-tree of the same height (the capacity of the tree). We show that the rebalancing cost when updating the tree, while maintaining optimal height, depends on ϵ. Specifically, our analysis gives a lower bound on the rebalancing cost of Ω(1/(ϵB)). We then describe a rebalancing algorithm whose amortized rebalancing cost has an almost matching upper bound of O(1/(ϵB)⋅log²(min{1/ϵ, B})). We additionally describe a scheme utilizing this algorithm which, given a rebalancing budget f(n), maintains optimal height for decreasing ϵ until the cost exceeds the budget, at which point it maintains optimal height plus one. Given a rebalancing budget of Θ(log n), this scheme maintains optimal height for all but a vanishing fraction of sizes in the intervals between tree capacities.
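The quantities n, N and ϵ above can be made concrete with a little arithmetic. The sketch below assumes a simplified model in which every node has exactly B children and holds B−1 keys (real B-trees allow occupancy down to roughly half), so it only illustrates how the slack ϵ, and hence the Ω(1/(ϵB)) bound, behaves as a tree fills up.

```python
def capacity(B, h):
    """Keys in a completely full B-tree of height h, assuming every node
    has exactly B children and B - 1 keys (a simplifying assumption)."""
    return B ** (h + 1) - 1

def optimal_height(B, n):
    """Smallest height whose capacity accommodates n keys."""
    h = 0
    while capacity(B, h) < n:
        h += 1
    return h

def slack(B, n):
    """eps with n = (1 - eps) * N, N the capacity at the optimal height.
    The rebalancing lower bound Omega(1/(eps*B)) grows as eps shrinks,
    i.e. as the tree approaches its capacity."""
    N = capacity(B, optimal_height(B, n))
    return 1 - n / N
```

For example, with B = 2 a full tree of height 2 has capacity 7, so a tree holding 4 keys has ϵ = 3/7; as n approaches 7 the slack vanishes and the per-update rebalancing cost bound blows up.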
Manuscript 2 presents an empirical analysis of practical randomized external-memory algorithms for computing the connected components of graphs. The best known theoretical results for this problem are essentially all derived from results for minimum spanning tree algorithms. Among randomized external-memory MST algorithms, the best asymptotic result has I/O-complexity O(sort(|E|)) in expectation, while an empirically studied practical algorithm has a bound of O(sort(|E|)⋅log(|V|/M)). We implement and evaluate an algorithm for connected components with expected I/O-complexity O(sort(|E|)), a simplification of the MST algorithm with this asymptotic cost, and we show that this approach can also yield good results in practice.
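The sampling-and-filtering idea behind that O(sort(|E|)) bound, specialized from MST to connectivity, can be sketched as follows. This is a hedged in-memory toy, not the external-memory implementation: sample half the edges, solve the sample recursively, and note that an edge whose endpoints already lie in the same sampled component cannot change connectivity, so only crossing edges survive to the merge step.

```python
import random

def cc_base(n, edges):
    """Exact connected components via union-find (the recursion's base case).
    Hooking larger roots under smaller ones makes each label the minimum
    vertex of its component, which keeps results deterministic."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[max(ru, rv)] = min(ru, rv)
    return [find(v) for v in range(n)]

def cc_sample_filter(n, edges, base=16):
    """Toy sketch of KKT-style sampling and filtering for connected components."""
    if len(edges) <= base:
        return cc_base(n, edges)
    sample = [e for e in edges if random.random() < 0.5]
    label = cc_sample_filter(n, sample)
    # Filtering: edges inside one sampled component cannot change connectivity.
    crossing = [(u, v) for u, v in edges if label[u] != label[v]]
    merged = cc_base(n, [(label[u], label[v]) for u, v in crossing])
    return [merged[label[v]] for v in range(n)]
```

The output is always exact; randomness only affects how much work the filter removes. The real algorithm replaces the union-find calls with recursive external-memory contraction phases and charges the filtering to a sorting pass.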
In Paper 3, we present a novel approach to simulating large-scale population protocol models. Naive simulation of N interactions of a population protocol with n agents and m states requires Θ(n log m) bits of memory and Θ(N) time. For very large n, this is prohibitive in both memory consumption and time, as interesting protocols will typically require N > n interactions for convergence. We describe a histogram-based simulation framework which instead requires Θ(m log n) bits of memory, an improvement since it is typically the case that n ≫ m. We analyze, implement, and compare a number of different data structures to perform correct agent sampling in this regime. For this purpose, we develop dynamic alias tables, which allow sampling an interaction in expected amortized constant time. We then show how to use sampling techniques to process agent interactions in batches, giving a simulation approach which uses subconstant time per interaction under reasonable assumptions.
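The histogram representation can be sketched briefly. The toy below keeps only a state-count dictionary (the Θ(m log n)-bit histogram) and samples an ordered pair of distinct agents by state; it uses a linear scan where the thesis uses dynamic alias tables, so it illustrates the representation, not the claimed sampling cost.

```python
import random

def sample_state(counts, n, rng):
    """Pick a state with probability proportional to its agent count."""
    r = rng.randrange(n)
    for s, c in counts.items():
        if r < c:
            return s
        r -= c

def simulate(counts, delta, steps, seed=0):
    """Histogram-based population protocol simulation (toy sketch).

    counts: dict state -> number of agents in that state
    delta:  transition (initiator_state, responder_state) -> new state pair
    """
    rng = random.Random(seed)
    n = sum(counts.values())
    for _ in range(steps):
        a = sample_state(counts, n, rng)
        counts[a] -= 1                         # initiator leaves the pool
        b = sample_state(counts, n - 1, rng)   # responder among the rest
        counts[b] -= 1
        a2, b2 = delta(a, b)                   # apply the protocol's transition
        counts[a2] = counts.get(a2, 0) + 1
        counts[b2] = counts.get(b2, 0) + 1
    return counts
```

Each step touches only the histogram, so memory is independent of n; replacing the linear scan in `sample_state` with an alias-table-style structure is what brings the per-interaction sampling down to expected amortized constant time.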
In Paper 4, we introduce the new model of fragile complexity for comparison-based algorithms. Within this model, we analyze classical comparison-based problems such as finding the minimum value of a set, selection (or finding the median), and sorting. We prove a number of lower and upper bounds and, in particular, give a number of randomized results describing trade-offs not achievable by deterministic algorithms.
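The flavor of the model can be shown on minimum-finding. In a balanced tournament, every element participates in at most ⌈log₂ n⌉ comparisons, so the fragile complexity of each element is logarithmic; the toy below tracks that per-element count (this illustrates the model's measure, not any particular bound from the paper).

```python
def tournament_min(xs):
    """Return (minimum, max comparisons any single element took part in)."""
    fragile = [0] * len(xs)          # comparisons each original element joins
    items = list(enumerate(xs))      # (original index, value)
    while len(items) > 1:
        nxt = []
        for j in range(0, len(items) - 1, 2):
            (ia, a), (ib, b) = items[j], items[j + 1]
            fragile[ia] += 1         # both participants are charged
            fragile[ib] += 1
            nxt.append((ia, a) if a <= b else (ib, b))
        if len(items) % 2:           # odd element gets a bye this round
            nxt.append(items[-1])
        items = nxt
    return items[0][1], max(fragile, default=0)
```

For 8 elements the winner survives 3 rounds, so its fragile complexity is 3 = log₂ 8; lower bounds in the model show the minimum element cannot do asymptotically better deterministically, which is where randomization buys the trade-offs mentioned above.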