/*
 * Copyright (c) 1996 Barton P. Miller
 *
 * We provide the Paradyn Parallel Performance Tools (below
 * described as "Paradyn") on an AS IS basis, and do not warrant its
 * validity or performance.  We reserve the right to update, modify,
 * or discontinue this software at any time.  We shall have no
 * obligation to supply such updates or modifications or any other
 * form of support to you.
 *
 * This license is for research uses.  For such uses, there is no
 * charge.  We define "research use" to mean you may freely use it
 * inside your organization for whatever purposes you see fit.  But you
 * may not re-distribute Paradyn or parts of Paradyn, in any form
 * source or binary (including derivatives), electronic or otherwise,
 * to any other organization or entity without our permission.
 *
 * (for other uses, please contact us at paradyn@cs.wisc.edu)
 *
 * All warranties, including without limitation, any warranty of
 * merchantability or fitness for a particular purpose, are hereby
 *
 * By your use of Paradyn, you understand and agree that we (or any
 * other person or entity with proprietary rights in Paradyn) are
 * under no obligation to provide either maintenance services,
 * update services, notices of latent defects, or correction of
 * defects for Paradyn.
 *
 * Even if advised of the possibility of such damages, under no
 * circumstances shall we (or any other person or entity with
 * proprietary rights in the software licensed hereunder) be liable
 * to you or any third party for direct, indirect, or consequential
 * damages of any character regardless of type of action, including,
 * without limitation, loss of profits, loss of use, loss of good
 * will, or computer failure or malfunction.  You agree to indemnify
 * us (and any other person or entity with proprietary rights in the
 * software licensed hereunder) for any and all liability it may
 * incur to third parties resulting from your use of Paradyn.
 */

// This class is intended to be used only in paradynd.
// This templated class manages a heap of objects in a UNIX shared-memory segment
// (currently, the template types that are used are either intCounter or tTimer).
// Note that the shm segment itself is managed by our "parent" class,
// fastInferiorHeapMgr.h/.C.  This class doesn't actually use any shared-memory

// Previously, this class would manage its own shared-memory segment, so that
// if a process creates, say, a fastInferiorHeap<intCounter> and two
// fastInferiorHeap<tTimer>, then 3 shm segments are created (and 6 shmat()'s
// are done, since both paradynd and the application need to shmat() to a given
// segment).  To cut down on the number of segments floating around, and to reduce
// and hopefully avoid the dreaded EMFILE errno when shmat()'ing to a segment,
// there's now just 1 shm segment per process (and 2 shmat()'s are done).
// The actual shm seg management is now done in file fastInferiorHeapMgr.h/.C

#ifndef _FAST_INFERIOR_HEAP_H_
#define _FAST_INFERIOR_HEAP_H_

#include <assert.h> // assert() is used by the inline member functions below
#include "util/h/Vector.h"

class process; // avoids need for an expensive #include

template <class HK, class RAW>
   // where HK is the housekeeping information for something like "counter" or "timer"
   // (with entries for everything except the actual counter or timer value) and where
   // RAW is the same raw type used in the appl heap; presumably "int", etc.
class fastInferiorHeap {
   enum states {allocated, free, pendingfree, maybeAllocatedByFork};

   process *inferiorProcess;
      // ptr instead of ref due to include file problems (if this file is included
      // within class process, then class process isn't fully defined when we reach
      // this point, so the compiler won't let us use a ref).

   // Let's take a moment to think about whether a 'baseAddrInApplic' is needed.
   // 1) It's not needed in order for paradynd to write to the shared seg;
   //    baseAddrInParadynd is used for that.
   // 2) But it is needed for paradynd to add instrumentation code that uses
   //    an object.  For example, say paradynd wants to add code instrumentation
   //    which calls startTimer() with (this is what's important here) the argument of
   //    a ptr to the tTimer.  To do this, it needs the addr in the inferior appl, not
   //    the addr in paradynd's attachment.

   RAW * baseAddrInApplic;
      // When ctor #1 (not the fork ctor) is used, this variable is undefined until
      // setBaseAddrInApplic() is called.

   RAW * baseAddrInParadynd;

   vector<states> statemap; // one entry per value (allocated or not) in the appl.
   vector<HK> houseKeeping; // one entry per value (allocated or not) in the appl.

   unsigned firstFreeIndex;
      // makes allocation quick; UINT_MAX --> no free elems in heap (but there could be
      // pending-free items)

   // Keeps track of what needs sampling, needed to sort out major/minor sampling
   vector<unsigned> permanentSamplingSet; // all allocated indexes
   vector<unsigned> currentSamplingSet; // a subset of permanentSamplingSet

   // Since we don't define these, making them private makes sure they're not used:
   fastInferiorHeap &operator=(const fastInferiorHeap &);
   fastInferiorHeap(const fastInferiorHeap &);

   void reconstructPermanentSamplingSet();

 public:

   fastInferiorHeap(RAW *iBaseAddrInParadynd,
                    process *iInferiorProcess,
                    unsigned heapNumElems);
      // Note that the ctor has no way to pass in the baseAddrInApplic because
      // the applic hasn't yet attached to the segment.  When the applic attaches
      // and tells us where it attached, we can call setBaseAddrInApplic() to fill
      // it in.

   fastInferiorHeap(const fastInferiorHeap &parent, process *newProc,
                    void *paradynd_attachedAt,
                    void *appl_attachedAt);
      // This copy-ctor is a fork()/dup()-like routine.  Call after a process forks.
      // From the process' point of view after the fork(): the fork() has attached it
      // to all shm segments of its parent; so, it needs to detach from them
      // (shmdt()) and then attach to a new segment.

      // call after the exec syscall has executed.  Basically we need to redo everything
      // that we did in the (non-fork) constructor.  Well, some things don't change: the
      // process ptr, the addr that paradynd attached at, the shm seg key.

   void forkHasCompleted();
      // call when a fork has completed (i.e. after you've called the fork ctor AND
      // also metricDefinitionNode::handleFork, as forkProcess [context.C] does).
      // performs some assertion checks, such as mi != NULL for all allocated HKs.
      // also recalculates some things, such as the sampling sets.

   void setBaseAddrInApplic(RAW *addr) {
      // should call _very_ soon after the ctor, right after the applic has
      // attached to the shm segment.
      assert(baseAddrInApplic == NULL); // not for long...
      baseAddrInApplic = addr;
   }

   RAW *getBaseAddrInApplic() const {
      assert(baseAddrInApplic != NULL);
      return baseAddrInApplic;
   }

   RAW *index2InferiorAddr(unsigned allocatedIndex) const {
      assert(baseAddrInApplic != NULL);
      assert(allocatedIndex < statemap.size());
      return baseAddrInApplic + allocatedIndex;
   }

   RAW *index2LocalAddr(unsigned allocatedIndex) const {
      assert(baseAddrInParadynd != NULL);
      assert(allocatedIndex < statemap.size());
      return baseAddrInParadynd + allocatedIndex;
   }

   void initializeHKAfterFork(unsigned index, const HK &iHKValue);
      // After a fork, the hk entry for this index will be copied, which is probably
      // not what you want.  (We don't provide a param for the raw item since you
      // can write the raw item easily enough by just writing directly to shared
      // memory...see baseAddrInParadynd.)

   bool alloc(const RAW &iRawValue, const HK &iHouseKeepingValue,
              unsigned &allocatedIndex);
      // Allocate an entry in the inferior heap and initialize its raw value with
      // "iRawValue" and its housekeeping value with "iHouseKeepingValue".
      // Returns true iff successful; false if not (because inferior heap was full).
      // If true, then also sets param "allocatedIndex" to the index
      // (0 thru heapNumElems-1) that was allocated; this info is probably needed by
      // the caller.  For example, the caller might be allocating a mini-tramp
      // that calls startTimer(); it needs "allocatedIndex" to calculate the
      // address of the timer structure (which is passed to startTimer).
      //
      // Note: we write to the inferior heap, in order to fill in the initial value.
      //       Since the inferior heap is in shared memory, we don't need to write it
      //       using the /proc lseek/write() combo, which means the inferior process
      //       doesn't necessarily need to be paused (which is slow as hell to do).
      //       This is nice, but remember:
      //       (1) /proc writes are still needed at least for patching code to
      //           cause a jump to a base tramp.  Conceivably though, the base tramp and
      //           all mini-tramps (and of course counters & timers they write to) can be
      //           in shared-memory, which is nice.
      //       (2) if you're not going to pause the application, beware of race
      //           conditions.  A nice touch is to create things backwards; i.e. allocate
      //           the timer/counter in shared memory, then the mini-tramp in shared memory,
      //           then the base tramp in shared memory if need be, then initialize the
      //           code contents of the tramps, and only then patch up some function
      //           to call our base tramp (the last step using a pause; /proc write; unpause

   void makePendingFree(unsigned ndx, const vector<unsigned> &trampsUsing);
      // "Free" an item in the shared-memory heap.  More specifically, change its
      // statemap type from allocated to pending-free.  A later call to garbageCollect()
      // is the only way to truly free the item.  An item in pending-free
      // state will no longer be processed by processAll().
      // Note that we don't touch the shared-memory heap; we just play around with
      // statemap meta-data stuff.  Of course that doesn't mean that it's okay to call
      // this with the expectation that the raw item will still be written to, except
      // perhaps by a tramp that is itself in the process of being freed up.

   void garbageCollect(const vector<unsigned> &PCs);
      // called by alloc() if it needs memory, but it's a good idea to call this
      // periodically; progressive preemptive garbage collection can help make
      // allocation requests faster.
      // The parameter is a stack trace in the inferior process, containing PC-register

   bool doMajorSample(time64 wallTime, time64 procTime);
      // Reads data values (in the shared memory heap) and processes each allocated
      // item by calling HK::perform() on it.
      // Note: doesn't pause the application; instead, reads from shared memory.
      // Returns true iff the sample completed successfully.

   bool doMinorSample();
      // Call every once in a while after a call to doMajorSample() returned false.
      // It'll resample and return true iff the re-sample finished the job.  We keep
      // around state (currentSamplingSet) to keep track of what needs re-sampling;
      // this is reset to 'everything' (permanentSamplingSet) upon a call to
      // doMajorSample() and is reduced by that routine and by calls to doMinorSample().