Converting OMOP CDM to phenopackets: A model alignment and patient data representation evaluation
Journal of Biomedical Informatics,
Year: 2024,
Issue: 155, p. 104659
Published: May 21, 2024
Language: English
Pheno-Ranker: a toolkit for comparison of phenotypic data stored in GA4GH standards and beyond
BMC Bioinformatics,
Year: 2024,
Issue: 25(1)
Published: Dec. 4, 2024
Abstract
Background
Phenotypic data comparison is essential for disease association studies, patient stratification, and genotype–phenotype correlation analysis. To support these efforts, the Global Alliance for Genomics and Health (GA4GH) established the Phenopackets v2 and Beacon v2 standards for storing, sharing, and discovering genomic and phenotypic data. These standards provide a consistent framework for organizing biological data, simplifying their transformation into computer-friendly formats. However, matching participants using GA4GH-based formats remains challenging, as current methods are not fully compatible, limiting their effectiveness.
Results
Here, we introduce Pheno-Ranker, an open-source software toolkit for individual-level comparison of phenotypic data. As input, it accepts JSON/YAML data exchange formats from the Beacon v2 and Phenopackets v2 data models, as well as any data structure encoded in JSON, YAML, or CSV formats. Internally, the hierarchical data is flattened to one dimension and then transformed through one-hot encoding. This allows for efficient pairwise (all-to-all) comparisons within cohorts or for matching of a patient’s profile in cohorts. Users have the flexibility to refine their comparisons by including or excluding terms, applying weights to variables, and obtaining statistical significance through Z-scores and p-values. The output consists of text files, which can be further analyzed using unsupervised learning techniques, such as clustering or multidimensional scaling (MDS), and with graph analytics. Pheno-Ranker’s performance has been validated with simulated and synthetic data, showing its accuracy, robustness, and efficiency across various health data scenarios. A real data use case from the PRECISESADS study highlights its practical utility in clinical research.
Conclusions
Pheno-Ranker is a user-friendly, lightweight software for semantic similarity analysis of phenotypic data in Beacon v2 and Phenopackets v2 formats, extendable to other data types. It enables the comparison of a wide range of variables beyond HPO or OMIM terms while preserving full context. The software is designed as a command-line tool with additional utilities for CSV import, data simulation, summary statistics and plotting, and QR code generation. For interactive analysis, it also includes a web-based user interface built with R Shiny. Links to the online documentation, including a Google Colab tutorial, and the tool’s source code are available on the project home page: https://github.com/CNAG-Biomedical-Informatics/pheno-ranker.
Language: English
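The flatten, one-hot encode, and compare pipeline described in the Pheno-Ranker abstract can be illustrated with a minimal Python sketch. This is not Pheno-Ranker's own code: the function names, the three sample records, and the choice of Hamming distance as the comparison metric are all assumptions made for illustration (the abstract does not name a metric). List positions are deliberately ignored when flattening, so array order does not affect the result.

```python
from itertools import combinations

# Hypothetical Phenopacket-like records, invented for illustration only.
records = {
    "P1": {"sex": {"id": "NCIT:C20197"},
           "phenotypicFeatures": [{"featureType": {"id": "HP:0001250"}}]},
    "P2": {"sex": {"id": "NCIT:C16576"},
           "phenotypicFeatures": [{"featureType": {"id": "HP:0001250"}}]},
    "P3": {"sex": {"id": "NCIT:C16576"},
           "phenotypicFeatures": [{"featureType": {"id": "HP:0002315"}}]},
}


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a set of 'path.value' strings."""
    terms = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            terms |= flatten(value, f"{prefix}{key}.")
    elif isinstance(obj, list):
        # List indices are dropped so that element order does not matter.
        for value in obj:
            terms |= flatten(value, prefix)
    else:
        terms.add(f"{prefix}{obj}")
    return terms


def one_hot(recs):
    """Build a shared vocabulary and one binary vector per record."""
    flat = {rid: flatten(rec) for rid, rec in recs.items()}
    vocabulary = sorted(set().union(*flat.values()))
    vectors = {
        rid: [1 if term in terms else 0 for term in vocabulary]
        for rid, terms in flat.items()
    }
    return vocabulary, vectors


def hamming(a, b):
    """Count the positions where two equal-length binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(vectors):
    """All-to-all comparison: distance for every unordered pair of records."""
    return {
        (i, j): hamming(vectors[i], vectors[j])
        for i, j in combinations(sorted(vectors), 2)
    }


vocabulary, vectors = one_hot(records)
distances = pairwise_distances(vectors)
```

In this toy cohort, P1 and P2 share a phenotype but differ in sex, so they end up closer to each other than P1 and P3, which share nothing.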
Cross-Standard Health Data Harmonization using Semantics of Data Elements
Scientific Data,
Year: 2024,
Issue: 11(1)
Published: Dec. 19, 2024
Faced with heterogeneity of healthcare data, we propose a novel approach for harmonizing data elements (i.e., attributes) across health data standards. This approach focuses on the implicit concept that is represented by a data element. The process includes the following steps: identifying concepts, clustering similar concepts, and constructing mappings between clusters using the Simple Standard for Sharing Ontological Mappings (SSSOM) and the Resource Description Framework (RDF), enabling the creation of reusable mappings. As a proof-of-concept, the approach is applied to five common health data standards (HL7 FHIR, OMOP, CDISC, Phenopackets, and openEHR), four domains, such as demographics and diagnoses, and nine topics within those domains, such as gender and vital status. These domains and topics are selected to represent a broader range of domains and topics in the health field. For each topic, data elements were found after a thorough search, resulting in the analysis of 64 data elements, the identification of their underlying concepts, and the development of mappings. Three use cases were implemented to demonstrate the role of data element harmonization in querying data at varying levels of granularity. The approach helps to overcome the limitations of context-dependent data element mappings and provides valuable insight into the data mapping practice in the health domain.
Language: English
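The three steps named in the abstract (identify concepts, cluster similar concepts, construct mappings) can be sketched in Python as follows. This is a rough illustration rather than the authors' pipeline: the concept labels assigned to each data element, the `STANDARD:path` identifiers, and the use of exact label equality as the clustering rule are all assumptions. The column names and the `skos:exactMatch` and `semapv:ManualMappingCuration` values follow the SSSOM vocabulary.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Hypothetical input: (standard, element path, concept label assigned by a curator).
DATA_ELEMENTS = [
    ("FHIR", "Patient.gender", "administrative gender"),
    ("OMOP", "person.gender_concept_id", "administrative gender"),
    ("Phenopackets", "Individual.sex", "biological sex"),
    ("OMOP", "death.death_date", "date of death"),
    ("FHIR", "Patient.deceasedDateTime", "date of death"),
]

SSSOM_COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]


def cluster_by_concept(elements):
    """Group data elements that were assigned the same concept label."""
    clusters = defaultdict(list)
    for standard, path, concept in elements:
        # Illustrative identifier scheme, not a registered CURIE prefix.
        clusters[concept].append(f"{standard}:{path}")
    return dict(clusters)


def build_mappings(clusters):
    """Emit one SSSOM-style row per pair of elements inside a cluster."""
    rows = []
    for members in clusters.values():
        for subject, obj in combinations(members, 2):
            rows.append({
                "subject_id": subject,
                "predicate_id": "skos:exactMatch",
                "object_id": obj,
                "mapping_justification": "semapv:ManualMappingCuration",
            })
    return rows


def to_tsv(rows):
    """Serialize mapping rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SSSOM_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


clusters = cluster_by_concept(DATA_ELEMENTS)
mappings = build_mappings(clusters)
tsv_text = to_tsv(mappings)
```

Keeping "administrative gender" and "biological sex" in separate clusters reflects the abstract's point that mappings should follow the implicit concept behind an element rather than its name.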
Enhancing Semantic Interoperability in Precision Medicine: Converting OMOP CDM to Beacon v2 in the Spanish IMPaCT-Data Project
medRxiv (Cold Spring Harbor Laboratory),
Year: 2024,
Issue: unknown
Published: Dec. 28, 2024
Abstract
Objective
To introduce novel methods to convert OMOP CDM data into GA4GH Beacon v2 format, enhancing semantic interoperability within Spain’s IMPaCT-Data program for personalized medicine.
Materials and Methods
We utilized a file-based approach with the Convert-Pheno tool to transform OMOP CDM exports into Beacon v2 format. Additionally, we developed a direct connection from PostgreSQL OMOP CDM to the Beacon v2 API, enabling real-time data access without intermediary text files.
Results
We successfully converted OMOP CDM datasets from three research centers (CNAG, IIS La Fe, and HMar) into Beacon v2 format with nearly 100% data completeness. The direct connection approach improved data freshness and adaptability to dynamic environments.
Discussion and Conclusion
This study introduces two methodologies for integrating OMOP CDM data with Beacon v2, offering performance optimization or real-time access. These methodologies can be adopted by other centers to enhance interoperability and collaboration in health data sharing.
Language: English
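To make the OMOP CDM to Beacon v2 conversion more concrete, the sketch below maps rows of the OMOP `person` table to minimal Beacon v2-style individuals. It is not Convert-Pheno's implementation or the project's PostgreSQL connector: the function name, the sample rows, and the fallback to an "Unknown" sex term are assumptions. The OMOP gender concept IDs (8507, 8532) are the standard ones; the NCIT terms are those commonly seen in Beacon v2 examples and should be checked against the target schema.

```python
# OMOP standard gender concept IDs mapped to NCIT ontology terms.
OMOP_GENDER_TO_NCIT = {
    8507: {"id": "NCIT:C20197", "label": "Male"},
    8532: {"id": "NCIT:C16576", "label": "Female"},
}

# Assumed fallback for missing or unmapped gender concepts.
UNKNOWN_SEX = {"id": "NCIT:C17998", "label": "Unknown"}


def person_to_individual(person):
    """Convert one OMOP `person` row (as a dict) to a minimal individual."""
    sex = OMOP_GENDER_TO_NCIT.get(person.get("gender_concept_id"), UNKNOWN_SEX)
    return {
        "id": str(person["person_id"]),
        "sex": dict(sex),  # copy so callers cannot mutate the lookup table
    }


# Hypothetical rows, e.g. read from a CSV export or a database cursor.
persons = [
    {"person_id": 1, "gender_concept_id": 8532},
    {"person_id": 2, "gender_concept_id": 8507},
    {"person_id": 3, "gender_concept_id": 0},
]

individuals = [person_to_individual(p) for p in persons]
```

The same row-level function could serve either route described in the abstract: applied to file exports in batch, or to rows fetched on demand from a live database.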