Research Square (Research Square), Journal Year: 2023, Volume and Issue: unknown, Published: Sept. 18, 2023
Abstract
Given a user dataset U and an object dataset I, a kNN join query in high-dimensional space returns the k nearest neighbors of each user in U from I. The kNN join is a basic and necessary operation in many applications, such as databases, data mining, computer vision, multimedia, machine learning, recommendation systems, and more.
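For reference, a brute-force kNN join can be sketched as below; the function name, data shapes, and the use of Euclidean distance are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def knn_join(U, I, k):
    """Brute-force kNN join: for each row (user) of U, return the indices of
    its k nearest rows (items) of I under Euclidean distance.
    Illustrative baseline only; cost is O(|U| * |I| * d)."""
    # Squared distances via ||u - i||^2 = ||u||^2 + ||i||^2 - 2 u.i
    sq = (np.sum(U * U, axis=1)[:, None]
          + np.sum(I * I, axis=1)[None, :]
          - 2.0 * U @ I.T)
    # Indices of the k smallest distances per user (order within the k is arbitrary).
    return np.argpartition(sq, kth=k - 1, axis=1)[:, :k]

# Example: 1,000 users and 5,000 items in 128 dimensions.
rng = np.random.default_rng(0)
users = rng.standard_normal((1000, 128))
items = rng.standard_normal((5000, 128))
result = knn_join(users, items, k=10)   # shape (1000, 10)
```

The quadratic scan above is what index structures such as the HDR Tree family aim to avoid, especially when the join result must be kept current as the data changes.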
In the real world, datasets are frequently updated dynamically as objects are added or removed. In this paper, we propose novel methods for continuous kNN join over dynamic high-dimensional data. We first propose the HDR+ Tree, which supports more efficient insertion, deletion, and batch update.
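As context for what such updates involve, a naive maintenance routine for item insertion is sketched below; this is not the HDR+ Tree algorithm, only the per-user bookkeeping that tree indexes are designed to accelerate, and all names here are assumptions.

```python
import heapq
import numpy as np

def insert_item(user_vecs, knn_heaps, item_id, item_vec, k):
    """Naive continuous-kNN maintenance on item insertion.
    Each user's result is a max-heap of (-squared_distance, item_id)
    pairs holding at most k entries; heaps are updated in place."""
    for u, heap in zip(user_vecs, knn_heaps):
        d = float(np.sum((u - item_vec) ** 2))
        if len(heap) < k:
            heapq.heappush(heap, (-d, item_id))
        elif d < -heap[0][0]:          # new item beats the current k-th neighbor
            heapq.heapreplace(heap, (-d, item_id))
```

Deletion is the harder case: every user whose current top-k contained the removed item needs its result recomputed, which is presumably where the paper's pruning-based recomputation comes in.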
Further, having observed that existing works rely on globally correlated data for effective dimensionality reduction, we then propose the HDR Forest. It clusters the data and constructs multiple HDR Trees to capture local correlations among the data. As a result, our HDR Forest is able to process non-globally correlated data efficiently.
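The contrast between a single global projection and per-cluster projections can be illustrated with the following sketch, which assumes k-means clustering and one PCA model per cluster; the actual HDR Forest construction is more involved, so treat this only as an illustration of capturing local correlations.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def fit_local_pca(items, n_clusters=4, reduced_dim=16, seed=0):
    """Cluster the items, then fit one PCA per cluster so each projection
    reflects the local correlation structure of its own cluster."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(items)
    pcas = [PCA(n_components=reduced_dim).fit(items[km.labels_ == c])
            for c in range(n_clusters)]
    return km, pcas

def reduce_vector(x, km, pcas):
    """Project a vector with the PCA of the cluster it falls into."""
    c = int(km.predict(x.reshape(1, -1))[0])
    return pcas[c].transform(x.reshape(1, -1))[0]

# Example usage on synthetic 64-dimensional data.
rng = np.random.default_rng(1)
items = rng.standard_normal((2000, 64))
km, pcas = fit_local_pca(items, n_clusters=4, reduced_dim=16)
z = reduce_vector(items[0], km, pcas)   # 16-dimensional representation
```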
Two optimisations are applied to the proposed HDR Forest, including the precomputation of the PCA states of items and pruning-based recomputation during item deletion. For the completeness of this work, we also present a proof of the correctness of computing distances in the reduced dimensions of the HDR Tree.
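Such a proof presumably rests on the standard fact that distances taken over a subset of an orthonormal PCA basis never overestimate the true distance, which is what makes pruning in reduced dimensions safe:

```latex
% W = [w_1, \dots, w_d]: orthonormal PCA basis of \mathbb{R}^d; W_m keeps the first m < d columns.
\[
\|W_m^{\top}(x - y)\|_2^{2}
  = \sum_{j=1}^{m} \langle w_j,\, x - y \rangle^{2}
  \le \sum_{j=1}^{d} \langle w_j,\, x - y \rangle^{2}
  = \|x - y\|_2^{2}.
\]
```

Hence any candidate whose reduced-dimension distance already exceeds the current k-th neighbour distance can be discarded without computing the full-dimensional distance.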
Extensive experiments on real-world datasets show that our methods outperform the baseline algorithms and the naive RkNN approach.