Basic objects
A striplog depends on a hierarchy of objects. This notebook shows the objects and their basic functionality.
Lexicon: A dictionary containing the words and word categories to use for rock descriptions.
Component: A set of attributes.
Interval: One element from a Striplog. It consists of a top, a base, a description, one or more Components, and a source.
Striplogs (a set of Intervals) are described in a separate notebook.
Decors and Legends are also described in another notebook.
import striplog
striplog.__version__
# If you get a lot of warnings here, just run it again.
'unknown'
Lexicon
from striplog import Lexicon
print(Lexicon.__doc__)
A Lexicon is a dictionary of 'types' and regex patterns.
Most commonly you will just load the default one.
Args:
params (dict): The dictionary to use. For an example, refer to the
default lexicon in ``defaults.py``.
help(Lexicon)
Help on class Lexicon in module striplog.lexicon:
class Lexicon(builtins.object)
| Lexicon(params)
|
| A Lexicon is a dictionary of 'types' and regex patterns.
|
| Most commonly you will just load the default one.
|
| Args:
| params (dict): The dictionary to use. For an example, refer to the
| default lexicon in ``defaults.py``.
|
| Methods defined here:
|
| __init__(self, params)
| Initialize self. See help(type(self)) for accurate signature.
|
| __repr__(self)
| Return repr(self).
|
| __str__(self)
| Return str(self).
|
| expand_abbreviations(self, text)
| Parse a piece of text and replace any abbreviations with their full
| word equivalents. Uses the lexicon.abbreviations dictionary to find
| abbreviations.
|
| Args:
| text (str): The text to parse.
|
| Returns:
| str: The text with abbreviations replaced.
|
| find_synonym(self, word)
| Given a string and a dict of synonyms, returns the 'preferred'
| word. Case insensitive.
|
| Args:
| word (str): A word.
|
| Returns:
| str: The preferred word, or the input word if not found.
|
| Example:
| >>> syn = {'snake': ['python', 'adder']}
| >>> find_synonym('adder', syn)
| 'snake'
| >>> find_synonym('rattler', syn)
| 'rattler'
|
| TODO:
| Make it handle case, returning the same case it received.
|
| find_word_groups(self, text, category, proximity=2)
| Given a string and a category, finds and combines words into
| groups based on their proximity.
|
| Args:
| text (str): Some text.
| tokens (list): A list of regex strings.
|
| Returns:
| list. The combined strings it found.
|
| Example:
| COLOURS = [r"red(?:dish)?", r"grey(?:ish)?", r"green(?:ish)?"]
| s = 'GREYISH-GREEN limestone with RED or GREY sandstone.'
| find_word_groups(s, COLOURS) --> ['greyish green', 'red', 'grey']
|
| get_component(self, text, required=False, first_only=True)
| Takes a piece of text representing a lithologic description for one
| component, e.g. "Red vf-f sandstone" and turns it into a dictionary
| of attributes.
|
| TODO:
| Generalize this so that we can use any types of word, as specified
| in the lexicon.
|
| parse_description(self, text)
| Parse a single description into component-like dictionaries.
|
| split_description(self, text)
| Split a description into parts, each of which can be turned into
| a single component.
|
| ----------------------------------------------------------------------
| Class methods defined here:
|
| default() from builtins.type
| Makes the default lexicon, as provided in ``defaults.py``.
|
| Returns:
| Lexicon: The default lexicon.
|
| from_json_file(filename) from builtins.type
| Load a lexicon from a JSON file.
|
| Args:
| filename (str): The path to a JSON dump.
|
| ----------------------------------------------------------------------
| Readonly properties defined here:
|
| categories
| Lists the categories in the lexicon, except the
| optional categories.
|
| Returns:
| list: A list of strings of category names.
|
| ----------------------------------------------------------------------
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
lexicon = Lexicon.default()
lexicon
{'lithology': ['overburden', 'sandstone', 'siltstone', 'shale', 'conglomerate', 'mudstone', 'limestone', 'dolomite', 'salt', 'halite', 'anhydrite', 'gypsum', 'sylvite', 'clay', 'mud', 'silt', 'sand', 'gravel', 'boulders'], 'modifier': ['silty', 'sandy', 'shale?y', 'muddy', 'pebbly', 'gravell?y'], 'amount': ['streaks?', 'veins?', 'stringers?', 'interbed(?:s|ded)?', 'blotch(?:es)?', 'bands?', 'fragments?', 'impurit(?:y|ies)', 'abundant', 'minor', 'some', 'rare', 'flakes?', 'trace', '[-.\\d]+%', '[-.\\d]+pc', '[-.\\d]+per ?cent'], 'grainsize': ['vf(?:-)?', 'f(?:-)?', 'm(?:-)?', 'c(?:-)?', 'vc', 'very fine(?: to)?', 'fine(?: to)?', 'medium(?: to)?', 'coarse(?: to)?', 'very coarse', 'v fine(?: to)?', 'med(?: to)?', 'med.(?: to)?', 'v coarse', 'grains?', 'granules?', 'pebbles?', 'cobbles?', 'boulders?'], 'colour': ['red(?:dish)?', 'gray(?:ish)?', 'grey(?:ish)?', 'black(?:ish)?', 'whit(?:e|ish)', 'blu(?:e|ish)', 'purpl(?:e|ish)', 'yellow(?:ish)?', 'green(?:ish)?', 'brown(?:ish)?', 'light', 'dark', 'sandy'], 'synonyms': {'Overburden': ['Drift'], 'Anhydrite': ['Gypsum'], 'Salt': ['Halite', 'Sylvite']}, 'splitters': [' with ', ' contain(?:s|ing) ', '\\. 
'], 'parts_of_speech': {'noun': ['lithology'], 'adjective': ['colour', 'grainsize', 'modifier'], 'subordinate': ['amount']}, 'abbreviations': {'gt': 'gritty', 'dist': 'distillate', 'gr': 'grained', 'LSD': 'legal subdivision', 'ptg': 'parting', 'alg': 'algal', 'mnr': 'minor', 'Assem': 'Assem', 'Vad': 'vadose', 'MCW': 'mud cut water', 'Pch': 'patch', 'gd': 'good', '/': 'with', 'ga': 'gauged', 'bulb': 'bulbous', 'Var': 'variation', 'gn': 'green', 'alt': 'altering', 'Gil': 'gilsonite', 'Oyst': 'oyster', 'sch': 'schist', 'Clst': 'claystone', 'ves': 'vesicular', 'Nod': 'nodules', 'tt': 'tightly', 'V.P.S.': 'very poor sample V.P.S.', 'asph': 'asphaltic', 'SIP': 'shut in pressure', 'tn': 'tan', 'G&OCM': 'gas and oil cut mud', 'DI': 'dual induction log', 'Mol': 'mollusca', 'Inoc': 'inoceramus*', 'spec': 'speckled', '@': 'at', 'Amph': 'amphipora*', 'vert': 'vertical', 'lrg': 'larger', 'md': 'muddy', 'wthrd': 'weathered', 'Brec': 'breccia', 'D & A': 'dry and abandoned', 'fau': 'fauna', 'depau': 'depauperate', 'shad': 'shadow', 'DF': 'derrick floor', 'gy': 'gray', 'Elev': 'elevation', 'indst': 'indistinct', 'Syring': 'syringopora*', 'fac': 'faceted', 'GL': 'guard log', 'brhg': 'branching', 'Gt': 'grit', 'Gr': 'grains', 'FIt': 'fault', 'Tas': 'tasmanites*', 'Spr': 'spore', 'fenst': 'fenestral', 'plag': 'plagioclase', 'p': 'poorly', 'Spk': 'speck', 'Spl': 'sampole', 'SNP': 'sidewall neutron porosity log', 'n.s.': 'no sample', 'Clcar': 'calcarenite', 'pbl': 'pebble (4-64 mm)', 'cmt': 'cemented', 'Spg': 'sponge', 'Grv': 'gravel', 'Cyp': 'cypridopsis*', 'Grt': 'granite', 'SP': 'spontaneous potential', 'Res': 'residue', 'Bioh': 'bioherm', 'Biost': 'biostrom', 'tex': 'texture', 'Brach': 'brachiopod', 'hydc': 'hydrocarbon', 'Ren': 'renalcis*', 'Biot': 'biotite', 'Peld': 'pelletoid', 'Bld': 'boulder', 'ter': 'terriginous', 'splty': 'splintery', 'acic': 'acicular', 'sug': 'sugary', 'bdd': 'bedded', 'bdg': 'bedding', 'gl': 'glassy', 'abd': 'abundant', 'Intst': 'intersticies', 'Micr': 
'micrite', 'O str': 'Ostracod', 'Clcsp': 'calcisphere', 'bl': 'bluish', 'abs': 'absent', 'abt': 'about', 'conspic': 'conspicuous', 'Spfool': 'superficial olite', 'sli': 'slightly', 'med': 'medium', 'O&SW': 'oil and salt water', 'biocl': 'bioclastic', 'Agg': 'aggregate', 'DST': 'drill stem test', 'Mn': 'manganese', 'psdo': 'pseudo', 'len': 'lentilcular', 'rug': 'rugosa', 'Wd': 'wood', 'hkl': 'hackly', 'DLL': 'dual laterolog', 't.s.': 'thin section', 'Mtrx': 'matrix', 'crpxln': 'crystocrystalline', 'Zn': 'zone', 'Anthr': 'anthracite', 'altg': 'alternating', 'blksh': 'blackish', 'bor': 'boreding', 'dru': 'drusy', 'ctd': 'coated', 'Conc': 'concretion', 'wtr cush': 'water cushion', 'ctc': 'contact', 'micropor': 'microporosity', 'Zr': 'zircon', 'Cono': 'conodont', 'sft': 'soft', 'Piso': 'pisoid', 'LL': 'laterolog', 'lmn': 'limonitic', 'volc': 'volcanics', 'Ctgs': 'cuttings', 'apr': 'apparent', 'tr': 'trace', 'app': 'appear', 'brt': 'bright', 'ls': 'limestone', 'Slt': 'silt', 'Circ': 'circulate', 'aph': 'aphanitic', 'oomol': 'oomoldic', 'brk': 'brackish', 'vit': 'vitreous', 'org': 'organic', 'n.v.p.': 'no visible porosity', 'w/': 'with', 'brd': 'bored', 'intxln': 'intercrystalline', 'Para': 'paraparchites*', 'Fp': 'flowing pressure', 'Ft': 'foot', 'perf': 'perforated', '&': 'and', 'fen': 'fenestraal', 'pred': 'predominantly', 'Phlog': 'phloaopite', 'perm': 'permeability', 'fibr': 'fibrous', 'msm': 'metasomatic', 'cht': 'chert', 'Fe': 'iron-ferruginous', 'dns': 'denser', 'vi': 'violet', 'fros': 'frosted', 'calc': 'calcitareous', 'Fm': 'formation', 'pres': 'preservation', 'cntrt': 'contorted', 'wvy': 'wavy', 'vps': 'very poor samples', 'ooc': 'oocastic', 'musc': 'muscovite', 'glas': 'glassy', 'Sh': 'shale', 'lchd': 'leached', 'rhb': 'rhombic', 'glau': 'glauconitic', 'Brac': 'brachiopod', 'choc': 'chocolate', 'chit': 'chitinous', 'Str': 'structure', 'Sl': 'slate', 'clus': 'cluster', 'Sa': 'salt', 'Cmt': 'cement', 'lmpy': 'lumpy', 'Sd': 'sand', 'intrapar': 'intraparticle', 
'phos': 'phosphatic', 'prly': 'pearly', 'Lstr': 'lustre', 'tns': 'tension', 'f': 'finely', 'fls': 'flesh', 'des': 'descript', 'SW': 'salt water', 'flt': 'faulted', 'Cbl': 'cobble', 'plty': 'platy', 'lithgr': 'lithographic', 'flk': 'flake', 'W.R.': 'washed residue', 'SO': 'show of oil', 'v': 'very', 'deb': 'debris', 'slty': 'silty', 'fld': 'feldsparthic', 'flg': 'flaggy', 'ptch': 'patches', 'Lig': 'lignite', 'mica': 'micaeous', 'bent': 'bentonitic', 'Min': 'mineral', 'sy-Ca': 'sparry calcite', 'Lim': 'limonite', 'Invtb': 'invertebrate', 'sps': 'sparsly', 'Mid': 'middle', 'yelsh': 'ish', 'sph': 'spherules', 'spl': 'sample', 'Ls': 'limestone', 'tab': 'tabular', 'Plcy': 'palecypod', 'scat': 'scattered', 'psool': 'pseudo oolitic', 'GR': 'gamma ray', 'spsly': 'sparsly', 'Bdeye': 'birdseye', 'purp': 'purple', 'Pyr': 'pyrite', 'hom': 'homogeneous', 'Kao': 'kaolin', 'Spic': 'spicule', 'hor': 'horizontal', 'Belm': 'belemnites*', 'sid': 'sideritic', 'aft': 'after', 'Rhb': 'rhomb', 'Typ': 'type', 'sim': 'similar', 'sil': 'siliceous', 'frag': 'fragmental', 'lam': 'laminated', 'fr': 'fair', 'mar': 'maroon', 'frac': 'fractured', 'srt': 'sorting', 'Dol': 'dolomite', 'I.P.': 'in part', 'cpct': 'compact', 'max': 'maximum', 'insl': 'insoluble', 'lac': 'lacustrine', 'Mrl': 'marl', 'mag': 'magnetic', 'Ctc': 'contact', 'ireg': 'irregular', 'lav': 'lavender', 'IAB': 'initial air blow', 'fl': 'filled', 'Tril': 'trilobite', 'Foram': 'foraminifera', 'MMCFG': 'million cubic feet of gas', 'Clvg': 'cleavage', 'Alg': 'algal', 'sp': 'spotty', 'Deer': 'decrease', 'Microspr': 'microspar', 'su': 'sulphurous', 'or': 'orangish', 'strk': 'streaked', 'stri': 'striated', 'sh': 'shale', 'Moll': 'mollusc', 'spkld': 'speckled', 'sm': 'smooth', 'sl': 'slightly', 'sc': 'scales', 'sb': 'sub', 'sa': 'salt', 'strg': 'stringer', 'Trip': 'tripoli', 'Equiv': 'equivalent', 'lse': 'loose', 'mnut': 'minute', 'BHCS': 'bore hole compensated sonic', 'Cal': 'caliper', 'Chk': 'chalk', 'Biomi': 'biomicrite', 'brit': 
'brittle', 'coln': 'colonial', 'Smwt': 'somewhat', 'Rf': 'reef', 'Cht': 'chert', '(D)': 'development', 'decr': 'decreasing', 'Rk': 'rock', 'RT': 'rotary table', 'fis': 'fissile', 'cln': 'clean', 'xl': 'crystalline', 'cbl': 'cobble (64-256 mm)', 'Aglm': 'agglomerate', 'Evap': 'evapourite', 'sa-c': 'salt castic', 'fib': 'fibrous', 'imp': 'impression', 'lt': 'lighter', 'eux': 'euxinic', 'cren': 'crenulated', 'gty': 'gravity', 'dess': 'dessiccation', 'clr': 'clear', 'TSTM': 'too small to measure', 'Bas': 'basalt', 'Shw': 'show', 'nod': 'nodule', 'uncons': 'unconsolidated', 'rexl': 'recrystallization', 'Cav': 'cavernous', 'FAB': 'fair air blow', 'aprox': 'approximately', '₵': 'core', 'Schm': 'schist', 'Shl': 'shell', 'xln': 'crystalline', 'Rad': 'radial', 'Oomol': 'oomold', 'Dist': 'distillate', 'extr': 'extremely', 'l': 'lower', 'Anhy': 'anhydrite', 'intv': 'interval', 'Stach': 'stachyodes*', 'crpxl': 'cryptocrystalline', 'Poln': 'pollen', 'Chtz': 'chitinozoa', 'MCFG': 'thousand cubic feet of gas', 'frmwk': 'framework', 'Lith': 'lithology', 'Macrofos': 'macrofossil', 'prphy': 'porphyry', 'Lut': 'lutite', 'OFM': 'oil flecked mud', 'Asph': 'assemblage', 'intpar': 'interparticle', 'Perm': 'permeability', 'Bdst': 'boundstone', 'chty': 'cherty', 'meta': 'metamorphic', 'ex': 'excellent', 'hrtl': 'horizontal', 'p-p': 'pin point', 'zeo': 'zeolite', 'n/s': 'no show', 'nac': 'nacerous', 'Wtr': 'water', 'apprx': 'approximate', 'rr': 'rare', 'Musc': 'muscovite', 'intclas': 'intraclastic', 'rep': 'replacedment', 'porcel': 'porcelaneous', 'BOPH': 'barrels of oil per hour', 'rd': 'rounded', 'rf': 'reefoid', 'sml': 'small', 'blk': 'black', 'IP': 'initial production', 'bld': 'bladed', 'Tub': 'tube', 'Pisol': 'pisolite', 'mtx': 'matrix', 'lmy': 'limy', 'Phos': 'phosphate', 'crd': 'cored', 'G': 'gas', 'Spo': 'spore', 'GCM': 'gas cut mud', 'SGCM': 'slight gas cut mud', 'OWWO': 'old well worked over', 'Novac': 'novaculite', 'rhmb': 'rhombic', 'argl': 'argillate', 'W': 'west', 'carb': 
'carbonaceous', 'Frac': 'fracture', 'freq': 'frequent', 'ES': 'electric', 'Pyrxn': 'pyroxene', 'g': 'good', 'Contam': 'contamination', 'srtg': 'Sorteding', 'Chal': 'chalcedony', 'fspr': 'feldsparathic', 'Char': 'charophyte', 'SSO': 'slight show of oil', 'Hyde': 'hydrocarbon', 'w': 'well', 'intgn': 'inter grown', 'vrtb': 'vertebrate', 'Sphal': 'sphalerite', 'intpt': 'interpretation', 'cche': 'caliche', 'Frg': 'fringe', 'sacc': 'saccharoidal', 'onc': 'oncolites', 'Slick slick': 'slickenside', 'Diagn': 'diagenesis', 'zn': 'zone', 'Imp': 'impression', 'sblit': 'sublithic', 'Prod': 'production', 'Incl': 'inclusion', 'Frag': 'fragment', 'sphal': 'sphalerite', 'Iran': 'granule', 'dtrl': 'detritalus', 'wthd': 'weathered', 'WCM': 'water cut mud', 'GCW': 'gas cut water', 'Pet': 'petroleum', 'Microstyl': 'microstylolite', 'Volc': 'volcanic', 'BWPH': 'barrels of water per hour', 'thru': 'throughout', 'intcl': 'intraclasts', 'Biosp': 'biosparite', 'Girv': 'girvanella*', 'OWDD': 'oil well drilled deeper', 'consol': 'consolidated', 'Fus': 'fusulinid', 'sz': 'size', 'grysh': 'greyish', 'mrlst': 'marlstone', 'Strk': 'streak', 'SGCW': 'slight gas cut water', 'crnk': 'crinkled', 'Rbl': 'rubble', 'Fuc': 'fucoid', 'bdeye': 'birdseye', 'gyp': 'gypsumiferous', 'orng': 'orange', 'Endo': 'endothyra*', 'BHFP': 'bottom hole flow pressure', 'Xl': 'crystal', 'stn': 'staining', 'Qtz': 'quartz', 'csg': 'casing', 'Chlor': 'chlorite', 'r': 'rare', 'mky': 'milky', 'str': 'streak', 'Onc': 'oncolite', 'Btm': 'bottom', 'Slst': 'siltstone', 'dissem': 'disseminated', 'spr': 'sparry', 'Pap': 'paper', 'grnt': 'granite', 'Par': 'particle', 'ang': 'angular', 'Descr': 'description', 'intgwn': 'intergrown', 'Stylio': 'styliolina*', 'Intclas': 'intraclast', 'Rem': 'remains', 'chlor': 'chlorite', 'euhd': 'euhedral', 'Pend': 'pendularous', 'Rec': 'recovery', 'grnl': 'granule (2-4 mm)', 'Calc': 'calcite', 'p.d': 'pressure deformation', 'recem': 'recemented', 'strgr': 'stringer', 'lig': 'lignitic', 'Glauc': 
'glauconite', 'intbd': 'interbedded', 'lim': 'itic', 'mic': 'micro', 'Db': 'diabase', 'Port por': 'porosity', 'Ech': 'echinoid', 'lit': 'lithic', 'Fluor': 'fluoresceince', 'vrvd': 'varved', 'Arag': 'aragonite', 'vgt': 'varigated', 'Clus': 'cluster', 'mot': 'mottled', 'surf': 'surface', 'pos': 'possibility', 'plas': 'plastic', 'pyr': 'pyritized', 'In': 'inch', 'kao': 'kaolin', 'Orbit': 'Orbitolina', 'MLL': 'microlaterolog', 'sdy': 'sandy', 'Mic': 'micaceous', 'Plt': 'plant', 'assoc': 'associated', 'rthy': 'earthy', 'Ivan': 'ivanovia*', 'suc': 'sucrosic', 'intercal': 'intercalated', 'glos': 'glossy', 'typ': 'typical', 'abv': 'above', 'MCO': 'mud cut oil', 'Unconf': 'unconformity', 'Bent': 'bentonite', 'dol': 'dolomitic', 'dom': 'dominant', 'flor': 'fluorescence', 'sltst': 'siltstone', 'brec': 'brecciated', 'Stri': 'striae', 'dism': 'disseminated', 'BHT': 'bottom hole temperature', 'exv': 'extrusive', 'Pol': 'polish', 'GTS': 'gas to surface', 'Mbr': 'member', 'exp': 'exposed', 'Pybit': 'pyrobitumen', 'a.a.': 'same as above sample', 'intlam': 'interlaminated', 'sd': 'sand (1/16-2 mm)', 'Rpl': 'ripple', 'dend': 'dendritic', 'pkr': 'packer', 'drlg': 'drilling', 'Spher': 'spherule', 'Hal': 'halitiferous', 'lstr': 'lustre', 'Exclas': 'extraclast', 'ML': 'microlog, minilog', 'anhy': 'anhydritic', 'drlr': 'driller', 'Rud': 'rudist', 'grapst': 'grapestone', 'Bubl': 'bubble', 'pris': 'prismatic', 'Bnd': 'band', 'och': 'ochre', 'spher': 'spherulitic', 'Jt': 'joint', 'occ': 'occasional', 'Wl': 'well', 'varic': 'varicolored', 'intrlam': 'interlaminated', 'O&G': 'oil and gas', 'mnrl': 'mineralized', 'Shlt por': 'shelter porosity', 'AOF': 'absolute open flow', 'Gyp': 'gypsumiferous', 'Bit': 'bitumen', 'Gast': 'gastropod', 'Pst': 'pumice-stone', 'PB': 'plugged back', 'Stromlt': 'stromatolite', 'BO': 'barrels of oil', 'dk': 'darker', 'CN': 'compensated neutron', 'KB': 'kelly bushing', 'dd': 'dead', 'Intvl': 'interval', 'stromlt': 'stromatolite', 'Fspr': 'feldspar', 'Milid': 
'miliolid', 'cotg': 'coateding', 'cotd': 'coateding', 'repl': 'replacement', 'slily': 'slightly', 'struc': 'structure', 'SO&G': 'show of oil and gas', 'gen': 'generally', 'hetr': 'heterogeneous', 'crs': 'coarse', 'bar': 'baritic', 'bas': 'basaltic', '(W)': 'wildcat', 'shy': 'shaly', 'FDL': 'formation density log', 'w/o': 'without w/o', 'BW': 'barrels of water', 'rsns': 'resinous', 'PL': 'proximity log', 'gran': 'granular', 'BWPD': 'barrels of water per day', 'x': 'cross', 'grad': 'grading', 'Qtzt': 'quartzite', 'crm': 'cream', 'res': 'residuual', 'Pt': 'part', 'OTD': 'old total depth', 'contam': 'contaminated', 'S.W.C.': 'sidewall core', 'qtz': 'quartz', 'magnt': 'magnetite', 'Pel': 'pellet', 'num': 'numerous', 'sec': 'secondary', 'fnly': 'finly', 'arg': 'argillaceous', 'ark': 'arkosic', 'rmn': 'remainant', 'prim': 'primary', 'volat': 'volatile', 'SOCW': 'slight oil cut water', 'piso': 'pisolitic', 'pkish': 'pinkish', 'metaph': 'metamorphosed', 'trnsp': 'transparent', 'irr': 'irregular', 'hornbd': 'hornblend', "'' or do": 'ditto', 'biost': 'biostromal', 'gept': 'geopetal', 'trnsl': 'translucent', 'PD': 'per day', 'PH': 'per hour', 'SO&W': 'show of oil and water', 'Vnlet': 'veinlet', 'Tham': 'thamnopora*', 'wg': 'vuggy', 'C': 'coal', 'men': 'meniscus', 'exclas': 'extraclastic', 'lg': 'long', 'jt': 'jointing', 'comp': 'completion', 'gnsh': 'greenish', 'wk': 'weak', 'wi': 'with', 'wh': 'white', 'Exv': 'extrusive rock', 'S': 'sonic, acoustilog', 'magn': 'magnetic', 'gywk': 'graywacke', 'G.W.': 'granite wash', 'Satm sat': 'saturation', 'tub': 'tubular', 'tuf': 'tuffaceous', 'coq': 'coquina', 'vug': 'vugular', 'c': 'coarsely', 'fnt': 'faintly', 'dkr': 'darker', 'cov': 'covered', 'conch': 'conchoidal', 'intgran': 'intergranular', 'SI': 'shut in', 'Chit': 'chitinous', 's': 'small', 'Meta': 'metamorphic rock', 'Sst': 'sandstone', 'brak': 'brackish', 'uni': 'uniform', 'com': 'common', 'cotd gn': 'coated grains', 'Het': 'Heterostegina', 'foram': 'foraminiferal', 'chky': 
'chalky', 'cl': 'clastic', 'cb': 'carbonized', 'Glas': 'glass', 'rad': 'radiating', 'poly': 'polygonal', 'hvy': 'heavy', 'pol': 'polished', 'Pent': 'pentamerus*', 'Hem': 'hematite', 'tgh': 'tough', 'cp': 'compare', 'Surf': 'surface', 'ps': 'pseudo-', 'frg': 'fringing', 'pt': 'partly', 'Bur': 'burrow', 'fri': 'friable', 'blsh': 'bluish', 'pch': 'patchy', 'stmg': 'streaming', 'frs': 'fresh', 'skel': 'skeletal', 'pyrbit': 'pyrobitumen', 'pk': 'pink', 'devit': 'devitrified', 'authg': 'authigenic', 'pl': 'plant', 'cly': 'clayey', 't.b.': 'thin-bedded', 'Tr': 'trace', 'IES': 'induction electric', 'WAB': 'weak air blow', 'lge': 'large', 'spic': 'spicular', 'psi': 'pounds per square inch', 'crbnt': 'carbonate', 'Tp': 'top', 'Pkst': 'packstone', 'Sedm': 'sediment', 'Cont': 'content', 'xbdg': 'cross-bedding', 'gns': 'gneiss', 'micgr': 'micrograined', 'Cl': 'clay', 'slt': 'silt', 'vcol': 'varicolored', 'undly': 'underlying', 'grnt.w': 'granite wash', 'n': 'no, none, non', 'contm': 'contaminated', 'sln': 'solution', 'rbl': 'rubblbly', 'fuc': 'fucoidal', 'Intr': 'intrusive', 'cvg': 'cavings', 'k': 'permeabilityable', 'slb': 'slabby', 'FTAB': 'faint air blow', 'DIL': 'dual induction laterolog', 'pyrcl': 'pyroclastic', 'cons': 'considerable', 'rndd': 'rounded', 'bot': 'botryoidal', 'Sel': 'selenite', 'vn': 'vein', 'Tf': 'tuffaceous', 'styl': 'stylotitic', 'conc': 'concretionary', 'mott': 'mottled', 'xlam': 'cross-laminated', 'x-strat': 'cross-stratified', 'Strom': 'stromatoporoid', 'ig': 'igneous', 'pap': 'papery', 'incr': 'increasing', 'litt': 'littoral', 'intstl': 'interstitial', 'bioturb': 'bioturbated', 'PPM': 'parts per million', 'GAP': 'good air blow', 'Repl': 'replaced', 'lith': 'lithographic', 'elong': 'elongate', 'Chara': 'charophytes', 'sat': 'saturated', 'incl': 'inclusion', 'Coq': 'coquina', 'Vug': 'vug', 'Cor': 'coral', 'intst': 'intersticitial', 'cncn': 'concentric', 'rng': 'range', 'orth': 'orthoclase', 'rdsh': 'redish', 'syn': 'syntaxial', 'Microfos': 
'microfossilferous', 'phr': 'phreatic', 'Wkst': 'wackestone', 'pisol': 'pisolitic', 'Col': 'color', 'Jasp': 'jasper', 'Mat': 'material', 'Mbl': 'marble', 'intxl': 'intercrystalline', 'detr': 'detrital', 'sed': 'sedimentary', 'x-bd': 'cross-bedded', 'gsy': 'grasy', 'OWPB': 'oil well plugged back', 'min': 'mineralized', 'Sol': 'Soution', 'Vn': 'vein', 'col': 'colored', 'x-lam': 'cross-laminated', 'thn': 'thin', 'thk': 'thick', 'fltg': 'floating', 'Ig': 'igneous rock', 'imbd': 'imbedded', 'ck': 'choke', 'BHP': 'bottom hole pressure', 'yel': 'yellow', 'Orth': 'orthoclase', 'sptd': 'spottedy', 'spty': 'spottedy', 'Sphaer': 'sphaerocodium*', 'Pbl': 'pebble', 'intfrag': 'interfragmental', 'Scaph': 'scaphopod', 'resd': 'residual', 'Bd': 'bed', 'Fe-mag': 'ferro-magnesian', 'sks': 'slickensided', 'rexlzd': 'recrystallized', 'Bm': 'basement', 'Glob': 'globigerina*', 'elg': 'elongate', 'unident': 'unidentifiable', 'Fau': 'fauna', 'Gal': 'galeolaria*', 'cub': 'cubic', 'Fac': 'facet', 'Glos': 'gloss', 'Gab': 'gabbro', 'bnd': 'banded', 'Oo': 'ooid', 'Gns': 'gneiss', 'amb': 'amber', 'strat': 'strataified', 'amm': 'ammonite', 'vis': 'visible', 'mos': 'mosaic', 'por': 'poroussity', 'uncons.': 'unconsolidated', 'embd': 'embedded', 'Dia': 'diameter', 'rnd': 'rounded', 'sbang': 'subangular', 'cntr': 'centered', 'mol': 'moldic', 'Cvg': 'caving', 'bit': 'bitumeninous', 'Micropor': 'micropore', 'S.W.': 'salt water', 'amt': 'amount', 'mod': 'moderate', 'Crin': 'crinoidal', 'Lyr': 'layer', 'brn': 'brown', 'boudg': 'boudinage', 'OC': 'oil cut', 'Microol': 'micro-oolite', 'Ltl': 'little', 'V.op': 'valve open', 'amor': 'amorphous', 'Ark': 'arkose', 'clas': 'clastic', 'Psool': 'pseudo oolite', 'Strat': 'strata', 'Bdg': 'bedding', 'bri': 'bright', 'drsy': 'drusy', 'tstg': 'testing', 'Scol': 'scolecodonts', 'crpld': 'crumpled', 'SOCM': 'slight oil cut mud', 'rec': 'recovered', 'Fvst': 'favosites*', 'sbrndd': 'sub rounded', 'p.p.': 'pin-poin', 'HO': 'heavy oil', 'olv': 'olive', 'Mdst': 'mudstone', 
's & p': 'salt and pepper', 'venn': 'vermillon', 'u': 'upper', 'gvl': 'gravel', 'fos': 'fossiliferous', 'pet': 'petroleumiferous', 'Clclt': 'calcilutite', 'OSR': 'oil source rock', 'Len': 'lens', 'pel': 'pellet', 'circ': 'circulation', 'prom': 'prominently', 'fol': 'foliated', 'peld': 'pelletoidal', 'Tent': 'tentaculites*', 'prob': 'probably', 'bd': 'bed', 'flky': 'flaky', 'bf': 'buff', 'LL8': 'laterolog-8', 'cgl': 'conglomerate', 'Gwke': 'graywacke', 'slky': 'silky', 'Grap': 'graptolite', 'Bor': 'bored', 'bo': 'bophaceous', 'Bot': 'botryoid', 'phen': 'phenocrysts', 'posa': 'possible', 'bu': 'buff', 'sel': 'selenite', 'Gran': 'granule', 'r.f.p': 'rounded frosted pitted', 'Clslt': 'calcisiltite', 'ss': 'sandstone', 'Uc': 'underclay', 'oo': 'ooidal', 'GOR': 'gas-to-oil ratio', 'Clcrd': 'calcirudite', 'euhed': 'euhedral', 'Solen': 'solenopora*', 'Euryamph': 'euryamphipora*', 'Ptg': 'parting', 'od': 'odor', 'eqnt': 'equant', 'Ost': 'ostracod', 'o': 'oil', 'Ceph': 'cephalopod', 'irid': 'iridescent', 'ox': 'oxidized', 'Casph': 'calcisphaera*', 'ti': 'tight', 'Foss': 'fossiliferous', 'sept': 'septate', 'marn': 'marine', 'Scs': 'scarce', 'op': 'open', 'chk': 'chalky', 'ahd': 'anhedral', 'Fen': 'fenestra', 'qtzc': 'quartzitic', 'sol': 'solitary', 'corln': 'coralline', 'mas': 'massive', 'ferr': 'ferruginous', 'Cub': 'cube', 'intr': 'intrusionive', 'Chaet': 'chaetetes*', 'OCM': 'oil cut mud', 'qtzs': 'quartzose', 'qtzt': 'quartzite', 'GIP': 'good initial puff', 'loc': 'location', 'blky': 'blocky', 'phyl': 'phyllitic', 'vrtl': 'vertical', 'Ooc': 'oolicast', 'anhed': 'anhedral', 'aren': 'arenaceous', 'Ool': 'oolite', 'bioh': 'biohermal', 'diagn': 'diagenesisetic', 'Pelec': 'pelecypod', 'biot': 'biotite', 'abnt': 'abundant', 'var': 'variable', 'hem': 'hematitic', 'gil': 'gilsonite', 'calctc': 'calcitic', 'clyst': 'claystone', 'fl/': 'flowing', 'bur': 'burrowed', 'ea': 'earthy', 'micr': 'micritic', 'grdg': 'grading', 'Flk': 'flake', 'hi': 'high', 's&p': 'salt & pepper', 'Flo': 
'flora', 'Deb': 'debris', 'Splin': 'splintery', 'hd': 'hard', 'coqid': 'coquinaoid', 'Grst': 'grainstone', 'Phyl': 'phyllite', 'med.': 'medium', 'microxln': 'microcrystalline', 'Plag': 'plagioclase', 'mat': 'material, matter', 'bldr': 'boulder', 'up': 'upper', 'Tex': 'texture', 'mdy': 'muddy', 'WIP': 'weak initial puff', 'olvn': 'olivine', 'ab': 'above', 'mrl': 'marly', 'orsh': 'orangish', 'Allo': 'allochem', 'm': 'medium', 'deer': 'decreasing', 'chal': 'chalcedony', 'ovgth': 'overgrowth', 'ap': 'appears', 'Sid': 'siderite', 'ind': 'indurated', 'pit': 'pitted', 'trip': 'tripolic', 'DL': 'density log', 'Sil': 'silica', 'Lam': 'laminations', 'F.Q.G.': 'frosted quartz grains', 'Cgl': 'conglomerate', 'hex': 'hexagonal', 'wxy': 'waxy', 'gry': 'greyish', 'Fe-st': 'ironstone', 'oom': 'oomoldic', 'grd': 'graded', 'BHSIP': 'bottom hole shut in pressure', 'Mag': 'magnetite', 'ool': 'oolitic', 'Lat': 'laterite', 'mass': 'massive', 'T.D.': 'total depth', 'Stn': 'stain', 'sqz': 'squeezed', 'E': 'east', 'BOPD': 'barrels of oil per day', 'aglm': 'agglomerate', 'evap': 'evapourititic', 'fluor': 'fluoresceincent', 'est': 'estimated', 'Su': 'sulphur', 'lent': 'lenticular', 'stal': 'stalactitic', 'N': 'Neutron', 'wtr': 'water', 'dolst': 'dolostone', 'bcm': 'becoming', 'OTS': 'oil to surface', 'SAB': 'strong air blow', 'Bry': 'bryozoa', 'O-Qtz': 'orthoquartzite', 'Styl': 'stylolite', 'crinal': 'crinoidal', 'Brk': 'break', 'mrly': 'marly', 'Av': 'average', 'xbd': 'cross-bedded', 'Radax': 'radiaxial', 'swbd': 'swabbed'}}
lexicon.synonyms
{'Overburden': ['Drift'],
'Anhydrite': ['Gypsum'],
'Salt': ['Halite', 'Sylvite']}
Most of the lexicon works ‘behind the scenes’ when processing descriptions into Rock components.
lexicon.find_synonym('Halite')
'salt'
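To make the lookup concrete, here is a minimal sketch of a case-insensitive synonym search over the synonyms dictionary shown above. The function name find_synonym_sketch is hypothetical, and this is only an illustration of the idea under that assumption, not striplog's actual implementation.

```python
# Synonyms copied from the default lexicon output above.
SYNONYMS = {
    'Overburden': ['Drift'],
    'Anhydrite': ['Gypsum'],
    'Salt': ['Halite', 'Sylvite'],
}


def find_synonym_sketch(word, synonyms=SYNONYMS):
    """Return the preferred word (lower case), or the input if not found."""
    target = word.lower()
    for preferred, alternatives in synonyms.items():
        # Compare case-insensitively against each listed alternative.
        if target in (alt.lower() for alt in alternatives):
            return preferred.lower()
    return word


print(find_synonym_sketch('Halite'))
print(find_synonym_sketch('Basalt'))
```

A word with no entry is returned unchanged, which matches the behaviour described in the find_synonym docstring.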
s = "grysh gn ss w/ sp gy sh"
lexicon.expand_abbreviations(s)
'greyish green sandstone with spotty gray shale'
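The expansion above can be pictured as a token-by-token dictionary lookup. The sketch below uses a handful of entries copied from the default lexicon's abbreviations; the function name expand_abbreviations_sketch is hypothetical, and splitting on whitespace is a simplifying assumption (the real method may use regular expressions and handle punctuation differently).

```python
# A small subset of the abbreviations shown in the default lexicon.
ABBREVIATIONS = {
    'grysh': 'greyish',
    'gn': 'green',
    'ss': 'sandstone',
    'w/': 'with',
    'sp': 'spotty',
    'gy': 'gray',
    'sh': 'shale',
}


def expand_abbreviations_sketch(text, abbreviations=ABBREVIATIONS):
    """Replace each whitespace-separated token that is a known abbreviation."""
    tokens = text.split()
    return ' '.join(abbreviations.get(token, token) for token in tokens)


expanded = expand_abbreviations_sketch("grysh gn ss w/ sp gy sh")
print(expanded)
```

Tokens that are not in the dictionary pass through untouched, so ordinary words in a description are preserved.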
Component
A set of attributes. All are optional.
from striplog import Component
print(Component.__doc__)
Initialize with a dictionary of properties. You can use any
properties you want e.g.:
- lithology: a simple one-word rock type
- colour, e.g. 'grey'
- grainsize or range, e.g. 'vf-f'
- modifier, e.g. 'rippled'
- quantity, e.g. '35%', or 'stringers'
- description, e.g. from cuttings
We define a new rock with a Python dict object:
r = {'colour': 'grey',
'grainsize': 'vf-f',
'lithology': 'sand'}
rock = Component(r)
rock
colour | grey |
grainsize | vf-f |
lithology | sand |
The Rock has a colour:
rock['colour']
'grey'
And it has a summary, which is generated from its attributes.
rock.summary()
'Grey, vf-f, sand'
We can format the summary if we wish:
rock.summary(fmt="My rock: {lithology} ({colour}, {grainsize!u})")
'My rock: sand (grey, VF-F)'
The formatting supports the usual s, r, and a conversions:
s: str
r: repr
a: ascii
Also some string functions:
u: str.upper
l: str.lower
c: str.capitalize
t: str.title
And some numerical ones, for arrays of numbers:
+ or ∑: np.sum
m or µ: np.mean
v: np.var
d: np.std
x: np.product
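Extra conversion codes like these can be supported by subclassing Python's string.Formatter and overriding convert_field. The sketch below covers only the four string codes; the class name SketchFormatter is hypothetical, and this is an illustration of the general technique rather than a copy of striplog's formatter.

```python
import string


class SketchFormatter(string.Formatter):
    """Formatter that adds !u, !l, !c and !t to the standard conversions."""

    _conversions = {
        'u': str.upper,
        'l': str.lower,
        'c': str.capitalize,
        't': str.title,
    }

    def convert_field(self, value, conversion):
        func = self._conversions.get(conversion)
        if func is not None:
            # Apply the custom string function to the field's text form.
            return func(str(value))
        # Fall back to the built-in handling of None, 's', 'r' and 'a'.
        return super().convert_field(value, conversion)


formatter = SketchFormatter()
summary = formatter.format(
    "My rock: {lithology} ({colour}, {grainsize!u})",
    lithology='sand', colour='grey', grainsize='vf-f',
)
print(summary)
```

Note that the built-in str.format rejects unknown conversion characters, so codes like !u only work through a custom Formatter subclass such as this one.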
x = {'colour': ['Grey', 'Brown'],
'bogosity': [0.45, 0.51, 0.66],
'porosity': [0.2003, 0.1998, 0.2112, 0.2013, 0.1990],
'grainsize': 'VF-F',
'lithology': 'Sand',
}
X = Component(x)
# This is not working at the moment.
#fmt = 'The {colour[0]!u} {lithology!u} has a total of {bogosity!∑:.2f} bogons'
#fmt += 'and a mean porosity of {porosity!µ:2.0%}.'
fmt = 'The {lithology!u} is {colour[0]!u}.'
X.summary(fmt)
'The SAND is GREY.'
X.json()
'{"colour": ["Grey", "Brown"], "bogosity": [0.45, 0.51, 0.66], "porosity": [0.2003, 0.1998, 0.2112, 0.2013, 0.199], "grainsize": "VF-F", "lithology": "Sand"}'
We can compare rocks with the usual == operator. As the result below shows, the comparison ignores case:
rock2 = Component({'grainsize': 'VF-F',
'colour': 'Grey',
'lithology': 'Sand'})
rock == rock2
True
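One way such a case-insensitive comparison could work is to normalise the keys and string values before comparing. The class below is a hypothetical stand-in (CaseInsensitiveRecord is not a striplog name), and the normalisation rule is an assumption that is merely consistent with the True result above.

```python
class CaseInsensitiveRecord:
    """A bag of properties whose equality test ignores letter case."""

    def __init__(self, properties):
        self.properties = dict(properties)

    def _normalized(self):
        # Lower-case keys and string values; leave other values alone.
        return {
            key.lower(): (value.lower() if isinstance(value, str) else value)
            for key, value in self.properties.items()
        }

    def __eq__(self, other):
        if not isinstance(other, CaseInsensitiveRecord):
            return NotImplemented
        return self._normalized() == other._normalized()


a = CaseInsensitiveRecord({'colour': 'grey', 'grainsize': 'vf-f', 'lithology': 'sand'})
b = CaseInsensitiveRecord({'grainsize': 'VF-F', 'colour': 'Grey', 'lithology': 'Sand'})
c = CaseInsensitiveRecord({'colour': 'red', 'lithology': 'sand'})
print(a == b, a == c)
```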
rock
colour | grey |
grainsize | vf-f |
lithology | sand |
In order to create a Component object from text, we need a lexicon to compare the text against. The lexicon describes the language we want to extract, and what it means.
rock3 = Component.from_text('Grey fine sandstone.', lexicon)
rock3
lithology | sandstone |
grainsize | fine |
colour | grey |
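As a rough picture of what parsing against a lexicon involves, the sketch below searches the text for the first matching regex pattern in each category. The patterns are a small subset of the default lexicon shown earlier; the function name component_from_text_sketch is hypothetical, and the matching rules (word boundaries, first match wins) are simplifying assumptions rather than striplog's actual logic.

```python
import re

# A few categories and patterns copied from the default lexicon.
LEXICON_SUBSET = {
    'lithology': ['sandstone', 'siltstone', 'shale'],
    'grainsize': ['very fine(?: to)?', 'fine(?: to)?', 'medium(?: to)?'],
    'colour': ['red(?:dish)?', 'grey(?:ish)?', 'black(?:ish)?'],
}


def component_from_text_sketch(text, lexicon=LEXICON_SUBSET):
    """Return a dict holding the first match found for each category."""
    found = {}
    for category, patterns in lexicon.items():
        for pattern in patterns:
            # Require whole-word matches and ignore letter case.
            regex = r'\b(?:' + pattern + r')\b'
            match = re.search(regex, text, flags=re.IGNORECASE)
            if match:
                found[category] = match.group(0).lower()
                break
    return found


attributes = component_from_text_sketch('Grey fine sandstone.')
print(attributes)
```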
Components support double-star-unpacking:
"My rock: {lithology} ({colour}, {grainsize})".format(**rock3)
'My rock: sandstone (grey, fine)'
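Double-star unpacking works for any object that provides keys() and __getitem__, which is the minimum Python needs to treat it as a mapping. The class below is a hypothetical stand-in for illustration (Attributes is not a striplog name) and says nothing about how Component itself is built.

```python
class Attributes:
    """Minimal mapping-like object that supports ** unpacking."""

    def __init__(self, **properties):
        self._properties = properties

    def keys(self):
        # Python calls keys() to discover the names to unpack.
        return self._properties.keys()

    def __getitem__(self, key):
        # Then it looks each name up with square-bracket access.
        return self._properties[key]


sample = Attributes(lithology='sandstone', colour='grey', grainsize='fine')
text = "My rock: {lithology} ({colour}, {grainsize})".format(**sample)
print(text)
```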
Position
Positions define points in the earth, like a top, but with uncertainty. You can define:
upper: the highest possible location
middle: the most likely location
lower: the lowest possible location
units: the units of measurement
x and y: the x and y location (these don’t have uncertainty, sorry)
meta: a Python dictionary containing anything you want
Positions don’t have a ‘way up’.
from striplog import Position
print(Position.__doc__)
Used to represent a position: a top or base.
Not sure whether to go with upper-middle-lower or z_max, z_mid, z_min.
Sticking to upper and lower, because ordering in Intervals is already
based on 'above' and 'below'.
params = {'upper': 95,
'middle': 100,
'lower': 110,
'meta': {'kind': 'erosive', 'source': 'DOE'}
}
p = Position(**params)
p
upper | 95.0 |
middle | 100.0 |
lower | 110.0 |
Even if you don’t give a middle, you can always get z: the central, most likely position:
params = {'upper': 75, 'lower': 85}
p = Position(**params)
p
upper | 75.0 |
middle | |
lower | 85.0 |
p.z
80.0
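The value 80.0 is the midpoint of upper and lower. A minimal sketch of that fallback is shown below; the class name PositionSketch is hypothetical, and taking the arithmetic mean when middle is missing is an assumption that is consistent with the output above, not a statement about striplog's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PositionSketch:
    """A depth with optional uncertainty bounds."""

    upper: Optional[float] = None
    middle: Optional[float] = None
    lower: Optional[float] = None

    @property
    def z(self):
        # Prefer the explicit most-likely depth when it is given.
        if self.middle is not None:
            return float(self.middle)
        # Otherwise fall back to the midpoint of the two bounds.
        return (self.upper + self.lower) / 2


print(PositionSketch(upper=75, lower=85).z)
print(PositionSketch(upper=95, middle=100, lower=110).z)
```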
Interval
Intervals are where it gets interesting. An interval can have:
a top
a base
a description (in natural language)
a list of Components
Intervals don’t have a ‘way up’; it’s implied by the order of top and base.
from striplog import Interval
print(Interval.__doc__)
Used to represent a lithologic or stratigraphic interval, or single point,
such as a sample location.
Initialize with a top (and optional base) and a description and/or
an ordered list of components.
Args:
top (float): Required top depth. Required.
base (float): Base depth. Optional.
description (str): Textual description.
lexicon (dict): A lexicon. See documentation. Optional unless you only
provide descriptions, because it's needed to extract components.
max_component (int): The number of components to extract. Default 1.
abbreviations (bool): Whether to parse for abbreviations.
TODO:
Seems like I should be able to instantiate like this:
``Interval({'top': 0, 'components':[Component({'age': 'Neogene'})``s
I can get around it for now like this:
``Interval(**{'top': 0, 'components':[Component({'age': 'Neogene'})``
Question: should Interval itself cope with only being handed 'top' and
either fill in down to the next or optionally create a point?
I might make an Interval explicitly from a Component…
Interval(10, 20, components=[rock])
top | 10.0 |
primary | colour: grey, grainsize: vf-f, lithology: sand |
summary | 10.00 m of grey, vf-f, sand |
description | |
data | |
base | 20.0 |
… or I might pass a description and a lexicon, and Striplog will parse the description and attempt to extract structured Component objects from it.
repr(Interval(20, 40, "Grey sandstone with shale flakes.", lexicon=lexicon))
"Interval({'top': Position({'middle': 20.0, 'units': 'm'}), 'base': Position({'middle': 40.0, 'units': 'm'}), 'description': 'Grey sandstone with shale flakes.', 'data': {}, 'components': [Component({'lithology': 'sandstone', 'colour': 'grey'})]})"
Notice I only got one Component, even though the description contains a subordinate lithology. This is the default behaviour; to get more components we have to ask for them with max_component:
interval = Interval(20, 40, "Grey sandstone with black shale flakes.", lexicon=lexicon, max_component=2)
print(interval)
{'top': Position({'middle': 20.0, 'units': 'm'}), 'base': Position({'middle': 40.0, 'units': 'm'}), 'description': 'Grey sandstone with black shale flakes.', 'data': {}, 'components': [Component({'lithology': 'sandstone', 'colour': 'grey'}), Component({'lithology': 'shale', 'amount': 'flakes', 'colour': 'black'})]}
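To give a feel for what parsing a description involves, here is a toy sketch. It is not striplog's parser: TOY_LEXICON, parse_component and parse_description are hypothetical names, the lexicon holds only a handful of patterns, and splitting the description on the word "with" is an assumption made for this example. It also ignores categories such as 'amount' that the real lexicon picks up.

```python
import re

# Hypothetical toy lexicon: category -> list of regex patterns.
TOY_LEXICON = {
    'lithology': [r'sandstone', r'shale', r'limestone'],
    'colour': [r'gr[ae]y', r'black', r'red'],
}


def parse_component(text, lexicon):
    """Return a dict with the first matching pattern for each category."""
    component = {}
    for category, patterns in lexicon.items():
        for pattern in patterns:
            match = re.search(pattern, text, flags=re.IGNORECASE)
            if match:
                component[category] = match.group(0).lower()
                break
    return component


def parse_description(description, lexicon, max_component=1):
    """Split on 'with' (an assumption) and keep the first few components."""
    parts = re.split(r'\bwith\b', description, flags=re.IGNORECASE)
    components = [parse_component(part, lexicon) for part in parts]
    components = [c for c in components if c]  # drop parts with no matches
    return components[:max_component]


text = "Grey sandstone with black shale flakes."
print(parse_description(text, TOY_LEXICON))
print(parse_description(text, TOY_LEXICON, max_component=2))
```

With the default max_component only the sandstone is returned; asking for two also returns the shale, mirroring the behaviour shown above.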
Intervals have a primary attribute, which holds the first component, no matter how many components there are.
interval.primary
lithology | sandstone |
colour | grey |
Ask for the summary to see the thickness and a Rock summary of the components. Note that the format code applies only to the Rock part of the summary, not to the thickness.
interval.summary(fmt="{colour} {lithology}")
'20.00 m of grey sandstone with black shale'
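The shape of that summary string can be reproduced with ordinary string formatting. This is a sketch under assumptions, not the library's code: summarize is a hypothetical helper, components are plain dicts, and joining them with ' with ' is inferred from the output above.

```python
def summarize(top, base, components, fmt):
    """Build 'thickness m of <component> with <component>...'."""
    thickness = abs(base - top)
    # The format string is applied to each component separately;
    # keys it does not mention (such as 'amount') are ignored.
    rocks = ' with '.join(fmt.format(**c) for c in components)
    return '{:.2f} m of {}'.format(thickness, rocks)


components = [
    {'lithology': 'sandstone', 'colour': 'grey'},
    {'lithology': 'shale', 'amount': 'flakes', 'colour': 'black'},
]
print(summarize(20, 40, components, fmt='{colour} {lithology}'))
# 20.00 m of grey sandstone with black shale
```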
We can change an interval’s properties:
interval.top = 18
interval
top | 18.0 |
primary | |
summary | 22.00 m of sandstone, grey with shale, flakes, black |
description | Grey sandstone with black shale flakes. |
data | |
base | 40.0 |
interval.top
upper | 18.0 |
middle | 18.0 |
lower | 18.0 |
Comparing and combining intervals#
# Depth ordered
i1 = Interval(top=61, base=62.5, components=[Component({'lithology': 'limestone'})])
i2 = Interval(top=62, base=63, components=[Component({'lithology': 'sandstone'})])
i3 = Interval(top=62.5, base=63.5, components=[Component({'lithology': 'siltstone'})])
i4 = Interval(top=63, base=64, components=[Component({'lithology': 'shale'})])
i5 = Interval(top=63.1, base=63.4, components=[Component({'lithology': 'dolomite'})])
# Elevation ordered
i8 = Interval(top=200, base=100, components=[Component({'lithology': 'sandstone'})])
i7 = Interval(top=150, base=50, components=[Component({'lithology': 'limestone'})])
i6 = Interval(top=100, base=0, components=[Component({'lithology': 'siltstone'})])
i2.order
'depth'
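Because the way up is implied by top and base, the order can be worked out from the two numbers alone. The function below is a hypothetical sketch of that idea, assuming that a top numerically greater than its base means elevation and anything else means depth; it is not taken from the library.

```python
def infer_order(top, base):
    """Guess whether a (top, base) pair is depth- or elevation-ordered."""
    # Assumption: values that decrease downwards are elevations.
    return 'elevation' if top > base else 'depth'


print(infer_order(62, 63))    # depth
print(infer_order(200, 100))  # elevation
```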
Technical aside: The Interval class is decorated with functools.total_ordering, so providing __eq__ and one other comparison (such as __lt__) in the class definition gives instances of the class an implicit order. So you can use sorted on a Striplog, for example.
It wasn’t clear to me whether this should compare tops (say), so that ‘>’ might mean ‘above’, or if it should be keyed on thickness. I chose the former, and implemented the other comparisons as separate methods.
print(i3 == i2) # False, they don't have the same top
print(i1 > i4) # True, i1 is above i4
print(min(i1, i2, i5).summary()) # 0.3 m of dolomite
False
True
0.30 m of dolomite
i2 > i4 > i5 # True
True
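The ordering behaviour from the technical aside can be reproduced in a few lines. ToyInterval is a hypothetical, depth-ordered stand-in written for this example; it keys comparisons on the top only, so that 'greater' means 'above', and lets functools.total_ordering fill in the remaining operators.

```python
from functools import total_ordering


@total_ordering
class ToyInterval:
    """Hypothetical depth-ordered interval; 'greater' means 'above'."""

    def __init__(self, top, base, name):
        self.top, self.base, self.name = top, base, name

    def __eq__(self, other):
        return self.top == other.top

    def __lt__(self, other):
        # In depth order a larger top value is deeper, i.e. 'below'.
        return self.top > other.top


a = ToyInterval(61, 62.5, 'limestone')
b = ToyInterval(63, 64, 'shale')
c = ToyInterval(63.1, 63.4, 'dolomite')

print(a > b)                                # True: __gt__ is derived
print(min(a, b, c).name)                    # dolomite: the deepest top
print([i.name for i in sorted([c, a, b])])  # deepest first
```

Only __eq__ and __lt__ are written by hand; >, >= and <= come from the decorator, which is why min and sorted work without any extra code.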
We can combine intervals with the + operator. (However, you cannot subtract intervals.)
i2 + i3
top | 62.0 |
primary | |
summary | 1.50 m of sandstone with siltstone |
description | 50.0% 1.00 m of siltstone with 50.0% 1.00 m of sandstone |
data | |
base | 63.5 |
Adding a rock adds a (minor) component and adds to the description.
interval + rock3
top | 18.0 |
primary | |
summary | 22.00 m of sandstone, grey with shale, flakes, black with sandstone, fine, grey |
description | Grey sandstone with black shale flakes. with Sandstone, fine, grey |
data | |
base | 40.0 |
i6.relationship(i7), i5.relationship(i4)
('partially', 'containedby')
print(i1.partially_overlaps(i2)) # True
print(i2.partially_overlaps(i3)) # True
print(i2.partially_overlaps(i4)) # False
print()
print(i6.partially_overlaps(i7)) # True
print(i7.partially_overlaps(i6)) # True
print(i6.partially_overlaps(i8)) # False
print()
print(i5.is_contained_by(i3)) # True
print(i5.is_contained_by(i4)) # True
print(i5.is_contained_by(i2)) # False
True
True
False
True
True
False
True
True
False
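The results above are consistent with simple range arithmetic. The sketch below works on plain (top, base) tuples rather than Interval objects, and all three function names are hypothetical. It assumes that intervals which merely touch do not overlap, as the False results suggest. The labels 'partially' and 'containedby' come from the output above; 'contains' is an assumed label for the reverse case.

```python
def _span(interval):
    """Normalise (top, base) to (low, high) so either order works."""
    top, base = interval
    return min(top, base), max(top, base)


def is_contained_by(a, b):
    (a_lo, a_hi), (b_lo, b_hi) = _span(a), _span(b)
    return b_lo <= a_lo and a_hi <= b_hi


def partially_overlaps(a, b):
    (a_lo, a_hi), (b_lo, b_hi) = _span(a), _span(b)
    overlap = min(a_hi, b_hi) - max(a_lo, b_lo)
    # Touching intervals have zero overlap and do not count.
    return (overlap > 0
            and not is_contained_by(a, b)
            and not is_contained_by(b, a))


def relationship(a, b):
    if is_contained_by(a, b):
        return 'containedby'
    if is_contained_by(b, a):
        return 'contains'  # assumed label
    if partially_overlaps(a, b):
        return 'partially'
    return None


i2, i4, i5 = (62, 63), (63, 64), (63.1, 63.4)
i6, i7 = (100, 0), (150, 50)
print(relationship(i6, i7), relationship(i5, i4))
print(partially_overlaps(i2, i4))
```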
x = i4.merge(i5)
x[-1].base = 65
x
Striplog(3 Intervals, start=63.0, stop=65.0)
i1.intersect(i2, blend=False)
top | 62.0 |
primary | |
summary | 0.50 m of sandstone |
description | |
data | |
base | 62.5 |
i1.intersect(i2)
top | 62.0 |
primary | |
summary | 0.50 m of limestone with sandstone |
description | 60.0% 1.50 m of limestone with 40.0% 1.00 m of sandstone |
data | |
base | 62.5 |
i1.union(i3)
top | 61.0 |
primary | |
summary | 2.50 m of limestone with siltstone |
description | 60.0% 1.50 m of limestone with 40.0% 1.00 m of siltstone |
data | |
base | 63.5 |
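The tops, bases and percentages in the intersect and union results can be checked by hand. The sketch below uses plain depth-ordered (top, base) tuples and hypothetical helper names. The percentages appear to weight each parent interval by its whole thickness rather than by the overlapping part; that is inferred from the descriptions above, not from the library source.

```python
def intersect_range(a, b):
    """Overlapping (top, base) of two depth-ordered ranges, or None."""
    top, base = max(a[0], b[0]), min(a[1], b[1])
    return (top, base) if top < base else None


def union_range(a, b):
    """Smallest depth-ordered range covering both inputs."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def blend_percentages(a, b):
    """Share of each parent interval, weighted by its full thickness."""
    thick_a, thick_b = a[1] - a[0], b[1] - b[0]
    total = thick_a + thick_b
    return (100 * thick_a / total, 100 * thick_b / total)


i1, i2, i3 = (61, 62.5), (62, 63), (62.5, 63.5)
print(intersect_range(i1, i2))    # (62, 62.5)
print(union_range(i1, i3))        # (61, 63.5)
print(blend_percentages(i1, i2))  # (60.0, 40.0)
```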
i3.difference(i5)
(Interval({'top': Position({'middle': 62.5, 'units': 'm'}), 'base': Position({'middle': 63.1, 'units': 'm'}), 'description': '', 'data': {}, 'components': [Component({'lithology': 'siltstone'})]}),
Interval({'top': Position({'middle': 63.4, 'units': 'm'}), 'base': Position({'middle': 63.5, 'units': 'm'}), 'description': '', 'data': {}, 'components': [Component({'lithology': 'siltstone'})]}))
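The two pieces returned by difference follow from cutting the second range out of the first. Here is a sketch on depth-ordered (top, base) tuples; difference_range is a hypothetical helper, and the choice to return an empty tuple when nothing is left is an assumption of this example.

```python
def difference_range(a, b):
    """Parts of depth-ordered range a that are not covered by b."""
    a_top, a_base = a
    b_top, b_base = b
    upper = (a_top, min(a_base, b_top))   # what remains above b
    lower = (max(a_top, b_base), a_base)  # what remains below b
    # Keep only pieces with a positive thickness.
    return tuple((top, base) for top, base in (upper, lower) if top < base)


i3, i5 = (62.5, 63.5), (63.1, 63.4)
print(difference_range(i3, i5))  # ((62.5, 63.1), (63.4, 63.5))
```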
©2015 Agile Geoscience. Licensed CC-BY. striplog.py