diff --git a/joss.05664/10.21105.joss.05664.crossref.xml b/joss.05664/10.21105.joss.05664.crossref.xml
new file mode 100644
index 0000000000..9517dc05ee
--- /dev/null
+++ b/joss.05664/10.21105.joss.05664.crossref.xml
@@ -0,0 +1,238 @@
+
+
+
+ 20230825T030514-25c7c9390bb5eb927c063a5392d35eb54e671061
+ 20230825030514
+
+ JOSS Admin
+ admin@theoj.org
+
+ The Open Journal
+
+
+
+
+ Journal of Open Source Software
+ JOSS
+ 2475-9066
+
+ 10.21105/joss
+ https://joss.theoj.org
+
+
+
+
+ 08
+ 2023
+
+
+ 8
+
+ 88
+
+
+
+ dcTensor: An R package for discrete matrix/tensor
+decomposition
+
+
+
+ Koki
+ Tsuyuzaki
+ https://orcid.org/0000-0003-3797-2148
+
+
+
+ 08
+ 25
+ 2023
+
+
+ 5664
+
+
+ 10.21105/joss.05664
+
+
+ http://creativecommons.org/licenses/by/4.0/
+
+
+
+ Software archive
+ 10.5281/zenodo.8275544
+
+
+ GitHub review issue
+ https://github.com/openjournals/joss-reviews/issues/5664
+
+
+
+ 10.21105/joss.05664
+ https://joss.theoj.org/papers/10.21105/joss.05664
+
+
+ https://joss.theoj.org/papers/10.21105/joss.05664.pdf
+
+
+
+
+
+ Binary matrix factorization with
+applications
+ Zhang
+ ICDM 2007
+ 10.1109/icdm.2007.99
+ 2007
+ Zhang, Z., Li, T., Ding, C., &
+Zhang, X. (2007). Binary matrix factorization with applications. ICDM
+2007, 391–400.
+https://doi.org/10.1109/icdm.2007.99
+
+
+ Probabilistic non-negative matrix
+factorization with binary components
+ Ma
+ MDPI mathematics
+ 10.3390/math9111189
+ 2021
+ Ma, X., Gao, J., Liu, X., Zhang, T.,
+& Tang, Y. (2021). Probabilistic non-negative matrix factorization
+with binary components. MDPI Mathematics, 1189.
+https://doi.org/10.3390/math9111189
+
+
+ Nonnegative matrix and tensor
+factorizations
+ Cichocki
+ 2009
+ Cichocki, A., Zdunek, R., Phan, A.
+H., & Amari, S. (2009). Nonnegative matrix and tensor
+factorizations. Wiley.
+
+
+ Non-negative tensor factorization using alpha
+and beta divergence
+ Cichocki
+ ICASSP ’07
+ 10.1109/icassp.2007.367106
+ 2007
+ Cichocki, A., Zdunek, R., Choi, S.,
+Plemmons, R., & Amari, S. (2007). Non-negative tensor factorization
+using alpha and beta divergence. ICASSP ’07, III-1393-III-1396.
+https://doi.org/10.1109/icassp.2007.367106
+
+
+ Nonnegative tucker
+decomposition
+ Kim
+ IEEE CVPR
+ 10.1109/cvpr.2007.383405
+ 2007
+ Kim, Y.-D., & Choi, S. (2007).
+Nonnegative tucker decomposition. IEEE CVPR, 1–8.
+https://doi.org/10.1109/cvpr.2007.383405
+
+
+ Learning the parts of objects by non-negative
+matrix factorization
+ Lee
+ Nature
+ 401
+ 10.1038/44565
+ 1999
+ Lee, D., & Seung, H. (1999).
+Learning the parts of objects by non-negative matrix factorization.
+Nature, 401, 788–791.
+https://doi.org/10.1038/44565
+
+
+ Benchmarking principal component analysis for
+large-scale single-cell RNA-sequencing
+ Tsuyuzaki
+ BMC Genome Biology
+ 21(1)
+ 10.1186/s13059-019-1900-3
+ 2020
+ Tsuyuzaki, K., Sato, H., Sato, K.,
+& Nikaido, I. (2020). Benchmarking principal component analysis for
+large-scale single-cell RNA-sequencing. BMC Genome Biology, 21(1), 9.
+https://doi.org/10.1186/s13059-019-1900-3
+
+
+ Extracting gene expression profiles common to
+colon and pancreatic adenocarcinoma using simultaneous nonnegative
+matrix factorization
+ Badea
+ Pacific Symposium on
+Biocomputing
+ 10.1142/9789812776136_0027
+ 2008
+ Badea, L. (2008). Extracting gene
+expression profiles common to colon and pancreatic adenocarcinoma using
+simultaneous nonnegative matrix factorization. Pacific Symposium on
+Biocomputing, 279–290.
+https://doi.org/10.1142/9789812776136_0027
+
+
+ Discovery of multi-dimensional modules by
+integrative analysis of cancer genomic data
+ Zhang
+ Nucleic Acids Research
+ 40(19)
+ 10.1093/nar/gks725
+ 2012
+ Zhang, S., Liu, C.-C., Li, W., Shen,
+H., Laird, P. W., & Zhou, X. J. (2012). Discovery of
+multi-dimensional modules by integrative analysis of cancer genomic
+data. Nucleic Acids Research, 40(19), 9379–9391.
+https://doi.org/10.1093/nar/gks725
+
+
+ Probabilistic latent tensor
+factorization
+ Yilmaz
+ IVA/ICA 2010
+ 10.1007/978-3-642-15995-4_43
+ 2010
+ Yilmaz, Y. K. (2010). Probabilistic
+latent tensor factorization. IVA/ICA 2010, 346–353.
+https://doi.org/10.1007/978-3-642-15995-4_43
+
+
+ A non-negative matrix factorization method
+for detecting modules in heterogeneous omics multi-modal
+data
+ Yang
+ Bioinformatics
+ 32(1)
+ 10.1093/bioinformatics/btv544
+ 2016
+ Yang, Z., & Michailidis, G.
+(2016). A non-negative matrix factorization method for detecting modules
+in heterogeneous omics multi-modal data. Bioinformatics, 32(1), 1–8.
+https://doi.org/10.1093/bioinformatics/btv544
+
+
+ Stochastic optimization for PCA and
+PLS
+ Arora
+ 2012 50th Annual Allerton Conference on
+Communication, Control, and Computing (Allerton)
+ 2012
+ Arora, R. (2012). Stochastic
+optimization for PCA and PLS. 2012 50th Annual Allerton Conference on
+Communication, Control, and Computing (Allerton),
+861–868.
+
+
+
+
+
+
diff --git a/joss.05664/10.21105.joss.05664.jats b/joss.05664/10.21105.joss.05664.jats
new file mode 100644
index 0000000000..4cddbc53c4
--- /dev/null
+++ b/joss.05664/10.21105.joss.05664.jats
@@ -0,0 +1,478 @@
+
+
+
+
+
+
+
+Journal of Open Source Software
+JOSS
+
+2475-9066
+
+Open Journals
+
+
+
+5664
+10.21105/joss.05664
+
+dcTensor: An R package for discrete matrix/tensor
+decomposition
+
+
+
+https://orcid.org/0000-0003-3797-2148
+
+Tsuyuzaki
+Koki
+
+
+
+
+
+
+Department of Artificial Intelligence Medicine, Graduate
+School of Medicine, Chiba University, Japan
+
+
+
+
+Laboratory for Bioinformatics Research, RIKEN Center for
+Biosystems Dynamics Research, Japan
+
+
+
+
+27
+6
+2023
+
+8
+88
+5664
+
+Authors of papers retain copyright and release the
+work under a Creative Commons Attribution 4.0 International License (CC
+BY 4.0)
+2022
+The article authors
+
+
+
+
+R
+discrete matrix factorization
+discrete tensor factorization
+dimension reduction
+
+
+
+
+
+ Summary
+
+ Matrix factorization (MF) is a widely used approach for extracting
+ significant patterns from a data matrix. MF is formalized as the
+ approximation of a data matrix X by the matrix product of two factor
+ matrices U and V. Because this formalization has a large number of
+ degrees of freedom, constraints are usually imposed on the solution.
+ Non-negative matrix factorization (NMF), which constrains the factor
+ matrices to be non-negative, is a widely used algorithm for decomposing
+ a non-negative data matrix. Because non-negative factors are easy to
+ interpret and the decomposition results can be used directly for
+ clustering, NMF has many applications in image processing, audio
+ processing, and bioinformatics (Cichocki et al., 2009).
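+
+ As an illustration of the formalization above, the following is a
+ minimal base-R sketch of NMF using the multiplicative updates of Lee
+ and Seung (1999) for the squared-error loss. The function name
+ nmf_sketch, the iteration count, and the small constant eps are
+ assumptions made for this illustration; this is not the implementation
+ used in dcTensor. The convention X ~ U %*% t(V) follows the Example
+ section.

```r
# Minimal NMF sketch (Lee-Seung multiplicative updates, Frobenius loss).
# NOTE: nmf_sketch, n_iter and eps are illustrative assumptions,
# not part of dcTensor.
nmf_sketch <- function(X, J, n_iter = 200, eps = 1e-10) {
  stopifnot(is.matrix(X), all(X >= 0))
  n <- nrow(X)
  m <- ncol(X)
  # Random non-negative initialization
  U <- matrix(runif(n * J), n, J)  # n x J
  V <- matrix(runif(m * J), m, J)  # m x J
  for (i in seq_len(n_iter)) {
    # U <- U * (X V) / (U V'V); eps avoids division by zero
    U <- U * (X %*% V) / (U %*% crossprod(V) + eps)
    # V <- V * (X'U) / (V U'U)
    V <- V * crossprod(X, U) / (V %*% crossprod(U) + eps)
  }
  list(U = U, V = V)
}

# Usage on a synthetic non-negative matrix of rank 3
set.seed(1)
X <- matrix(runif(20 * 3), 20, 3) %*% t(matrix(runif(15 * 3), 15, 3))
fit <- nmf_sketch(X, J = 3)
rel_err <- norm(X - fit$U %*% t(fit$V), "F") / norm(X, "F")
```

+ Because the updates only multiply non-negative quantities, U and V stay
+ non-negative without any explicit projection step.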
+
+ A discrete version of NMF can also be considered by imposing a
+ binary solution (e.g., {0,1}) on the factor matrices extracted from
+ the data matrix; this is called binary matrix factorization (BMF)
+ (Z. Zhang et al., 2007). BMF has recently been applied in several
+ data science domains, to data such as market basket data,
+ document-term data, Web click-stream data, DNA microarray expression
+ profiles, and protein-protein complex interaction networks.
+
+ Although BMF is becoming more widely used, current data analysis
+ requires further extensions. For example, we may need a ternary
+ solution (e.g., {0,1,2}) instead of a binary one; here, I call this
+ ternary matrix factorization (TMF). TMF would contribute to the
+ extraction of ordered patterns, such as stages of disease severity. It
+ is also possible to apply the discretization to only one of the two
+ factor matrices (U or V); here, I call this semi-binary matrix
+ factorization (SBMF) (Ma et al., 2021) or semi-ternary matrix
+ factorization (STMF). This extension contributes to the extraction of
+ discrete patterns from continuous-valued matrix data. Finally, there is
+ a growing demand to extend MF to the simultaneous factorization of
+ multiple matrices or tensors (multi-dimensional arrays) (Cichocki et
+ al., 2009). Such heterogeneous data sets are obtained when multiple
+ measurements with a common data structure are performed under different
+ experimental conditions. It would therefore be very convenient if
+ discretization were available for such heterogeneous data structures.
+ To meet these requirements, I developed dcTensor, an R/CRAN package
+ that performs several discrete matrix/tensor decomposition algorithms
+ (https://cran.r-project.org/web/packages/dcTensor/index.html).
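+
+ To make the terminology above concrete, the following base-R sketch
+ builds small toy matrices whose factors have the binary, semi-binary,
+ and semi-ternary structures described in this section. All variable
+ names and sizes are assumptions chosen for illustration; the sketch
+ only shows what the factor matrices look like and does not perform any
+ factorization.

```r
# Toy factor matrices illustrating BMF / SBMF / STMF structure.
# NOTE: names and sizes are illustrative assumptions.
set.seed(42)
n <- 6  # rows of X
m <- 4  # columns of X
J <- 2  # number of components

U_bin <- matrix(rbinom(n * J, size = 1, prob = 0.5), n, J)  # entries in {0,1}
U_ter <- matrix(sample(0:2, n * J, replace = TRUE), n, J)   # entries in {0,1,2}
V_bin <- matrix(rbinom(m * J, size = 1, prob = 0.5), m, J)  # entries in {0,1}
V_con <- matrix(runif(m * J), m, J)                         # continuous, non-negative

X_bmf  <- U_bin %*% t(V_bin)  # BMF: both factors binary
X_sbmf <- U_bin %*% t(V_con)  # SBMF: only U binary
X_stmf <- U_ter %*% t(V_con)  # STMF: only U ternary
```

+ In the semi-discrete cases, X itself is continuous-valued even though
+ one of its factors is discrete, which is the situation the SBMF and
+ STMF extensions target.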
+
+
+ Statement of need
+
+ Several tools can perform BMF, such as Nimfa, libmf, recosystem, and
+ Origami.jl, but there is no implementation of TMF, SBMF, STMF, or
+ extensions of MF to multiple matrices or tensors. For this reason, I
+ implemented these discrete matrix/tensor decomposition algorithms in R,
+ a popular open-source programming language.
+
+ dcTensor provides the following matrix/tensor decomposition
+ functions:
+
+
+
+ MF for a single data matrix
+
+
+
+ dNMF: Discretized Non-negative Matrix Factorization
+ (Cichocki et al., 2009; Lee & Seung, 1999)
+
+
+
+ dSVD: Discretized Singular Value Decomposition
+ (Tsuyuzaki et al., 2020)
+
+
+
+
+
+ MF for multiple data matrices
+
+
+
+ dsiNMF: Discretized Simultaneous Non-negative Matrix Factorization
+ (Badea, 2008; Cichocki et al., 2009; Yilmaz, 2010; S. Zhang et al.,
+ 2012)
+
+
+
+ djNMF: Discretized Joint Non-negative Matrix Factorization
+ (Cichocki et al., 2009; Yang & Michailidis, 2016)
+
+
+
+ dPLS: Discretized Partial Least Squares (Arora, 2012)
+
+
+
+ dNTD: Discretized Non-negative Tucker Decomposition
+ (Cichocki et al., 2009; Kim & Choi, 2007)
+
+
+
+
+
+
+ Example
+
+ As a demonstration, I show that SBMF can be easily performed on any
+ machine with R installed, using the following commands in R:
+# Install the required package (once per computer)
+install.packages("dcTensor")
+
+# Load required package (once per R instance)
+library("dcTensor")
+library("nnTensor")
+library("fields")
+
+# Load Toy data
+data <- toyModel("NMF")
+
+# Perform SBMF
+set.seed(1234)
+out <- dNMF(data, Bin_U=1E+6, J=5)
+
+# Reconstruction of the data matrix
+rec.data <- out$U %*% t(out$V)
+
+# Visualization
+layout(rbind(1:2, 3:4))
+image.plot(data, main="Original Data", legend.mar=8, zlim=c(0, max(data)))
+image.plot(rec.data, main="Reconstructed Data", legend.mar=8, zlim=c(0,max(data)))
+hist(out$U, breaks=100)
+hist(out$V, breaks=100)
+
+
+ Figure 1: Semi-binary Matrix Factorization (SBMF).
+
+
+
+ In the top left of Figure 1, we can see that the demo data has five
+ significant block patterns. In the top right, the reconstructed data,
+ which is the matrix product of the factor matrices U and V, shows the
+ same patterns, which indicates that the SBMF optimization has
+ converged properly. In the bottom left, we can see that U is binary
+ ({0,1}), whereas V is not (bottom right), which means the solution is
+ semi-binary. This solution is obtained by setting a large value for the
+ Bin_U argument of the dNMF function, which is the binary regularization
+ parameter for U. dNMF also has a Bin_V argument, which is the binary
+ regularization parameter for V. By setting large values for both Bin_U
+ and Bin_V, BMF can also be obtained. Likewise, ternary solutions (TMF
+ and STMF) can be obtained with the ternary regularization parameters
+ Ter_U and Ter_V.
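+
+ The variants described above can be sketched by changing only the
+ regularization arguments of dNMF. In the sketch below, the helper
+ max_dist_to_levels is a hypothetical function written for this
+ illustration (it is not part of dcTensor), and the value 1E+6 for
+ Bin_V and Ter_U is an assumption carried over from the Bin_U setting in
+ the example; in practice these values may need tuning. The dcTensor
+ part runs only if the packages are installed.

```r
# Hypothetical helper (not part of dcTensor): largest distance from any
# entry of M to its nearest allowed level; 0 means exactly discrete.
max_dist_to_levels <- function(M, levels) {
  d <- abs(outer(as.vector(M), levels, "-"))  # length(M) x length(levels)
  max(apply(d, 1, min))
}

# Sanity check on hand-made matrices
max_dist_to_levels(matrix(c(0, 1, 1, 0), 2, 2), c(0, 1))
max_dist_to_levels(matrix(c(0.2, 0.9), 1, 2), c(0, 1))

if (requireNamespace("dcTensor", quietly = TRUE) &&
    requireNamespace("nnTensor", quietly = TRUE)) {
  library("dcTensor")
  library("nnTensor")
  data <- toyModel("NMF")

  # BMF: binary regularization on both U and V (values are assumptions)
  set.seed(1234)
  out_bmf <- dNMF(data, Bin_U = 1E+6, Bin_V = 1E+6, J = 5)

  # STMF: ternary regularization on U only (value is an assumption)
  set.seed(1234)
  out_stmf <- dNMF(data, Ter_U = 1E+6, J = 5)

  # How close each factor is to the intended discrete levels
  print(max_dist_to_levels(out_bmf$U, c(0, 1)))
  print(max_dist_to_levels(out_bmf$V, c(0, 1)))
  print(max_dist_to_levels(out_stmf$U, c(0, 1, 2)))
}
```

+ A distance close to zero complements the histograms in Figure 1 as a
+ single-number check that a factor matrix is effectively discrete.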